Tuesday 31 May 2016

Building High Performance Big Data and Analytics Systems

Introduction

       Big Data and analytics systems are fast emerging as some of the most critical systems in an organization’s IT environment. But with such huge amounts of data come performance challenges. If Big Data systems cannot be used to make or forecast critical business decisions, or to surface the insights hidden in huge amounts of data at the right time, then these systems lose their relevance. This blog post discusses some of the critical performance considerations in a technology-agnostic way. It covers techniques and guidelines that can be applied during the different phases of a Big Data system (i.e. data extraction, data cleansing, processing, storage, as well as presentation). These are generic guidelines that any Big Data professional can use to ensure that the final system meets its performance requirements.

What is Big Data

          Big Data is one of the most common terms in the IT world these days. Though different terms and definitions are used to explain Big Data, in principle they all converge on the same point: with the generation of huge amounts of data from both structured and unstructured sources, traditional approaches to handling and processing this data are no longer sufficient.

Big Data systems are generally considered to have five main characteristics of data, commonly called the 5 Vs of data: Volume, Variety, Velocity, Veracity, and Value.

According to Gartner, high volume can be defined as follows: “Bigdata is high volume when the processing capacity of the native data-capture technology and processes is insufficient for delivering business value to subsequent use cases. High volume also occurs when the existing technology was specifically engineered for addressing such volumes – a successful big data solution”.

Building Blocks of a Big Data System

         A Big Data system comprises a number of functional blocks that give the system the capability to acquire data from diverse sources, pre-process it (e.g. cleansing, validation), store it, run processing and analytics on the stored data (e.g. predictive analytics, generating recommendations for online users, and so on), and finally present and visualize the summarized and aggregated results.

The following figure depicts these high-level components of a Big Data system.

Big Data System


Data Processing and Analysis

     Once cleansed and de-duplicated, the pre-processed data is available for the final processing and for applying the required analytical functions. Some of the steps involved in this stage are de-normalization of the cleansed data, correlating different sets of data, aggregating the results over pre-defined time intervals, running machine-learning algorithms, performing predictive analytics, and so on.
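As a simple illustration of the time-interval aggregation step, here is a minimal C# sketch (not from the original post; the SensorReading and IntervalAggregate types are hypothetical, and a real pipeline would typically run this kind of logic on a distributed engine rather than in-process LINQ). It groups cleansed readings into fixed-size time buckets and computes a count and a sum per bucket:

using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical cleansed record produced by the pre-processing stage.
class SensorReading
{
    public DateTime Timestamp { get; set; }
    public double Value { get; set; }
}

// One aggregated result per time bucket.
class IntervalAggregate
{
    public DateTime BucketStart { get; set; }
    public int Count { get; set; }
    public double Total { get; set; }
}

static class IntervalAggregator
{
    // Groups readings into fixed-size buckets (e.g. 5 minutes) and aggregates each bucket.
    public static List<IntervalAggregate> Aggregate(IEnumerable<SensorReading> readings, TimeSpan interval)
    {
        return readings
            .GroupBy(r => new DateTime(r.Timestamp.Ticks - (r.Timestamp.Ticks % interval.Ticks)))
            .OrderBy(g => g.Key)
            .Select(g => new IntervalAggregate
            {
                BucketStart = g.Key,
                Count = g.Count(),
                Total = g.Sum(r => r.Value)
            })
            .ToList();
    }
}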

In the following sections, this blog will present some of the best practices for carrying out data processing and analysis, to achieve better performance in a Big Data System.

Visualization and Presentation

The last step of a Big Data flow is to view the output of the different analytical functions. This step involves reading the pre-computed aggregated results (or other such entities) and presenting them as user-friendly tables, charts, and other visuals that make the results easy to interpret and understand.



Data Acquisition



Saturday 28 May 2016

Cloud and Web hosting



 Cloud and Web Hosting

Frequently I come across customers who are torn, or simply don’t know, whether to host their project on the cloud or with a web hosting provider. So, with the objective of clarifying which path to follow, let’s compare those two worlds. The following aims to help non-technical people and developers alike; while they don’t speak the same language, they need each other, especially when starting a new project.

The Cloud

While the cloud is very popular today, let’s start by defining it, which is quite simple. In the early days of networking, any equipment that wasn’t on your premises was pictured (literally) as a cloud, signifying that you didn’t care how communications got there; you just cared that the information reached the network and went where it was supposed to go. The internet world borrowed that cloud concept and now uses it to describe the same idea: you, as a user, don’t care about the technology or platform behind it, you just send your data there and benefit from what the service offers.

Web hosting

In the beginning of the internet, a server was an expensive computer. Even then, a single server could host multiple web pages from multiple accounts. That’s the concept behind web hosting: a company or a user leases space on that server so their web page can be hosted there. It is a shared space, with the advantage of lowering costs and simplifying maintenance. The user or company renting that hosting space doesn’t have to worry about maintaining the machine; they just use it and that’s it! On the downside, because it is a shared machine, there are constraints on what can be deployed on the server. Those constraints might be security related, a specific functionality or feature that is not available, or simply load times.



 The cloud today

         First, servers became cheaper and cheaper; second, companies like Google, Amazon and Microsoft needed thousands of servers in order to run their businesses. Those servers were provisioned to handle peak loads, so what about the valleys in load? Thus, the cloud was born. Amazon started renting the computing power it had to spare while its machines were almost idle, and because it had a gigantic number of machines, renting their use was really cheap. That’s the core of what a cloud service is today, and because it turned out to be a business in itself, that extra capacity is no longer just spare; it is the product. Amazon’s store back-end became Amazon Web Services, Google’s massive search-engine deployment paved the way for Google Cloud Platform, and Hotmail and Bing helped the inception of Office 365 and Azure.

Challenge | Cloud | Web hosting
Setup | Easy | Easy
Deployment | Easy | Easy
Maintenance | Depends on the service being used from the cloud provider | Easy
Security | Hard | Easier
Scalability | Possible | Impossible
Initial cost | More expensive | Low cost
Dependence | Code/setup will be tied to the specific cloud provider; moving out is neither cheap nor easy | Code/setup is independent of the provider; moving out is a non-issue
Mobile services integration | Easy | Non-existent
Coding (feature) flexibility | Complete! Do what you want! | Limited to what the web hosting allows

 Nuances

          As always there are nuances, so let’s expand a little on the differences.

Maintenance: With web hosting, maintenance is straightforward because you only maintain the code of the app; server patches (including security) are outside the customer’s (business owner’s) jurisdiction, so no worries there. For the cloud, maintenance depends directly on which cloud service the app or website uses. If the app uses Software as a Service (SaaS), maintenance is easier, as it is a service similar to web hosting, only on steroids. On the contrary, if the app is deployed on virtual machines in the cloud, all of the security hardening and server patching is a direct responsibility of the business, not the cloud provider.

Scalability: This is tricky. Rule of thumb: it’s better to start small and grow later. The development team can build an app that scales, but that comes at a development cost. If the app is beautifully coded to handle heavy load but the business never brings that load, the business ends up paying for an app that was over-dimensioned. While the risk of under-dimensioning always exists, it is better, business-wise, to build a small app with a lower development cost; if it later needs to handle more load, well, that’s a great problem to have! Ask the Twitter whale about it. Besides, thanks to the enormous amount of computing power available today, a small server or a single web hosting account might be able to handle that load easily. It all depends on the app’s functionality.

Initial Cost: Similar to scalability, web hosting is by far the lowest-cost way to deploy, as long as the app’s requirements allow the use of shared hosting. Not all of them do.

      Today many developers default to using the cloud first, even for a small project that could easily be hosted on shared hosting. Once the project is delivered, it becomes an operational cycle for the business, yet the business may have ignored these challenges from the beginning.

      One last thing: a website is not necessarily an app. If all you need is a website, then web hosting will do the job with the lowest cost and the fewest complications. At Mollejuo we apply a hybrid approach: our website is hosted on a shared account, along with e-mail, while our back-end is hosted in the cloud. By doing that, we get the best of both worlds.

Thursday 26 May 2016

Computer Invention


Cloud Computing

     Cloud computing is the term used to describe technology that is changing how we use our computers and software applications.

    The concept of cloud computing is that any device connected to the internet can utilize a shared network of computing resources.


   This would include infrastructure, applications and storage for far less than what it would cost to use your own hardware, software and resources.

    Additionally, it allows users to have access to applications that they would not have otherwise. Access is as simple as using an interface application or just a web browser from any location.

  The cloud can allow access to millions of computers in an intelligent, scalable and redundant system with expert support.

   Similar to outsourcing, the differences and advantages of cloud computing are speed, efficiency, capability and cost, particularly with the increasing popularity of smartphones and tablets.
It allows users to work from anywhere, to perform any task with any application, and to pay for only what they use.

   Cloud computing is comparable to using email or online banking where you log into your account to access and manage your information. The software, applications and storage do not exist on your computer.


  But unlike your email or online banking services, clouds can perform complicated engineering tasks, schematics, modeling or mathematical computations, and they can do this in a cost-effective and efficient manner.

  Another major advantage to cloud computing is the reliability of service.
Servers can crash, temporarily denying you online access to services and data. But cloud computing has multiple servers so you always have access.

  Some concerns about cloud computing are security and the effect it will have on the computer industry.

   The security concerns are similar to those regarding email and online banking.


  These have proven to have reliable security protocols, so similar advanced technology has been adopted for cloud computing.

Wednesday 25 May 2016

CRM App

Microsoft Announces Dynamics 

CRM App For Outlook


         The CRM App for Outlook was introduced in CRM 2015 Update 1 as a preview feature, and it is fully supported with CRM Online 2016. Support for CRM on-premises is planned for CRM 2016 Update 1.0.
Microsoft Dynamics CRM App for Outlook is a lightweight app that you can easily use to view Microsoft Dynamics CRM information and to track email from within Outlook. The CRM data appears right in your Outlook inbox.

  The CRM App for Outlook is now available in Outlook Web Access (OWA), in desktop Outlook (laptop or PC), and as a mobile app (for tablets and smartphones).

  According to the official blog, the Dynamics CRM App for Outlook allows you to do the following without the need to configure or install anything:
  • Work in the familiar Outlook environment,
  • Easily track/ untrack emails wherever you are
  • Convert an email message into a new CRM Record.
  • Create CRM contact records for people on the from list that aren’t already included in the CRM database.
  • Preview information about contacts and leads stored in CRM.
  • Open CRM records directly to find or enter more detailed information.
  • Create new CRM records for any entity, as long as the entity has been enabled for mobile (*) and for multi-entity search.
  Only entities enabled for mobile are visible in the CRM App for Outlook. To enable an entity for mobile, go to Settings -> Customization -> Customize the System, open the specific entity, and check “enable for mobile.”
Eligibility Requirements & Deployment

  The CRM App for Outlook can be used only with CRM Online 2016 or later deployments. Availability of the Dynamics CRM App for Outlook for Microsoft Dynamics CRM on-premises is planned for CRM 2016 Update 1.0.

Mailbox Settings

  It is necessary to ensure that the user’s mailbox is configured to synchronize incoming emails through Server-Side Synchronization.

User Settings

  A user needs to have a minimum set of privileges to qualify as an eligible user for the CRM App for Outlook. The minimum level of permissions is described in the picture:

 If the above requirements are met, users should be eligible to use the App.

  Users can also add the App to their Outlook by accessing Personal Options, Apps for Dynamics CRM.


Tuesday 24 May 2016

Dell Monitor


Dell Introduces 43-inch 4K Multi-client Monitor


Dell has announced a new monitor, and the specifications are quite interesting. The Dell P4317Q has a 43-inch 4K display with an option to run as four separate 1080p screens.

4K large screen

The 43-inch display with a high-performance monitor scaler offers exceptional clarity up to Ultra HD 4K.

Multi-client

Amazingly, you can connect up to four independent clients to a single monitor with customized views. This is done without any bezel breaks and makes tasks easier.

Multi-monitor setups are easier

A single-mount setup with fewer cables and an RS232 connection offers easy manageability.

One setup, four inputs

The monitor can show content from four separate inputs simultaneously in full HD. It comes with the following,
  • four USB 3.0,
  • two HDMI,
  • one DisplayPort,
  • one Mini DisplayPort,
  • one VGA port


Other features

It saves up to 30 percent in energy consumption. Also, you can zoom in to any single input to take full benefit of the 4K display.

Monday 23 May 2016

Latest Mouse Technology


LatestMouse






Feature List in C# 7.0

  1. Local functions – code currently available on GitHub
  2. Tuple types and literals
  3. Record types
  4. Pattern matching
  5. Non-nullable reference types
  6. Immutable types

Local Functions

Up to C# 6.0, the ability to declare methods and types in block scope, just like variables, could only be approximated using the Func<> and Action delegate types with anonymous methods, but those delegates lack the following:
  1. Generics
  2. Ref and out parameters
  3. Params

We can’t utilize these three features while using Func and Action. 
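For context, here is a minimal sketch (not from the original post; Foo and addOne are made-up names) of the pre-C# 7.0 workaround: a block-scoped “function” expressed as a Func delegate assigned from a lambda. It works for this simple case, but as listed above it cannot be generic, cannot take ref or out parameters, and cannot use params:

public int Foo(int someInput)
{
    // C# 6.0 style: a block-scoped "function" via a delegate and a lambda.
    Func<int, int> addOne = x => x + 1;

    // There is no way to make addOne generic, give it ref/out parameters,
    // or let it accept a params array.
    return addOne(someInput);
}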
In C# 7.0


Local functions would have the same capabilities as normal methods, but they can only be accessed within the block in which they were declared.

public int Foo(int someInput)
{
    int Bar()
    {
        Console.WriteLine("inner function");
        return someInput * 2; // a local function can use the enclosing method's parameters
    }
    return Bar();
}


Evolution
 Advantages over lambda expressions
  • The syntax is similar and more consistent with the methods.

  • Recursion and forward references would work for local functions but not for lambdas (see the sketch after this list).

  • The lambda causes one or two memory allocations (the frame object for holding the variables inside that function and a delegate); but local function causes no memory allocations.

  • A local function can be generic.

  • It can accept ref and out parameters.

  • The direct method invocation is indeed faster than a delegate invocation.
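
As a rough illustration of the recursion and generics points above (a sketch, not from the original post; SumOfSquares, Echo and SumTo are made-up names): the local function SumTo calls itself directly, and Echo takes a type parameter, neither of which a plain lambda assigned to a Func variable can do without extra ceremony.

static int SumOfSquares(int n)
{
    // A generic local function.
    T Echo<T>(T value) => value;

    // A recursive local function: a lambda would first need a separate,
    // pre-declared Func variable before it could call itself.
    int SumTo(int k) => k == 0 ? 0 : k * k + SumTo(k - 1);

    return Echo(SumTo(n));
}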
Tuple Types and literals


Multiple return types – up to C# 6.0

In the current version of C#, to return multiple values from a method we have to follow one of these approaches:

  1. Out parameters
  2. Tuple-Types
  3. Class/ Struct
The most common reason for grouping temporary variables is to return multiple values from a method. 


Out parameters

public void GetMultipleValues(string deptName, out double topGrossSalary, out string hrName) { ... }  

double gross;  
string hrName;  
GetMultipleValues(deptName, out gross, out hrName);  
Console.WriteLine($"Gross: { gross }, Hr: { hrName }");   
The disadvantage of using out parameters is that we can’t use them in async methods. Also, you have to declare the parameters up front and you have to specify their exact types. 


Tuple-Types



Currently, C# has the Tuple type to hold multiple unrelated values together. We can rewrite the same method using a tuple to achieve the same functionality.
public Tuple<double, string> GetMultipleValues(string name) { ... }  
  
var tupleSalary = GetMultipleValues (name);  
Console.WriteLine($"Gross: { tupleSalary.Item1 }, Hr: { tupleSalary.Item2 }");   
This does not have the disadvantages of out-parameters, but the resulting code is rather obscure with the resulting tuple having property names like Item1 and Item2.


Class / struct



You could also declare a new type and use that as the return type.
struct TopSalaryAndHr { public double topGrossSalary; public string hrName; }  
public TopSalaryAndHr GetMultipleValues(string name) { ... }  
  
var tupleSalary = GetMultipleValues (name);  
Console.WriteLine($"Gross: { tupleSalary.topGrossSalary }, HR: { tupleSalary.hrName }");   
This has none of the disadvantages of out parameters or the Tuple type, but it is rather verbose and adds meaningless overhead: the type exists only to carry values that do not really belong together or share any common characteristics. For the purpose of creating such temporary data structures, C# provides tuples.


All three approaches mentioned above have their own disadvantages, and C# 7.0 aims to overcome these shortcomings. 



Multiple return types in C# 7.0


Tuple return types:

You can specify multiple return values for a function, with much the same syntax you use for specifying multiple input parameters. These are called tuple types.

public (double topGrossSalary, string hrName) GetMultipleValues(string name) { ... }  
The syntax (double topGrossSalary, string hrName) indicates an anonymous struct type with public fields of the given names and types. Note that this is different from some notions of tuple, where the members are not given names but only positions (i.e. Tuple<double, string>); that is a common complaint, since it essentially degrades the consumption scenario to that of System.Tuple above. For full usefulness, tuple members need to have names. Tuple return types are also fully compatible with async:
public async Task<(double topGrossSalary, string hrName)> GetMultipleValuesAsync(string name) { ... }  
  
var t = await GetMultipleValuesAsync(name);  
Console.WriteLine($"Gross: { t.topGrossSalary }, Hr: { t.hrName }");   
Tuple literals



Tuple values could be created as,
var t = new (int sum, int count) { sum = 0, count = 0 };  
When creating a tuple value of a known target type, it should be possible to leave out the member names.
public (int sum, int count) Tally(IEnumerable<int> values)   
{  
    var s = 0; var c = 0;  
    foreach (var value in values) { s += value; c++; }  
    return (s, c); // target typed to (int sum, int count)  
}  
In the above example, we created a method whose return type is a tuple type with a known target type, and we constructed the tuple value inline from the previously declared individual variables (i.e. return (s, c)).


Using named arguments as a syntax analogy, it may also be possible to give the names of the tuple fields directly in the literal:
public (int sum, int count) Tally(IEnumerable<int> values)   
{  
    var res = (sum: 0, count: 0); // infer tuple type from names and values  
    foreach (var value in values) { res.sum += value; res.count++; }  
    return res;  
}  
Tuple deconstruction


We often don’t need the tuple object as a whole, because it doesn’t represent a particular entity or thing; the consumer of a tuple type usually doesn’t want to access the tuple itself, but rather its internal values. 



Instead of accessing the tuple properties as in the example of Tuple Return Types, you can also de-structure the tuple immediately:
(var sal, var hrName) = GetMultipleValues("some address");  
Console.WriteLine($"Salary: { sal }, Hr Name: {hrName}");  
Pattern Matching


Is Expression

The “is” operator can be used to test an expression against a pattern. As part of the pattern-matching feature, the “is” operator is repurposed to take a pattern on the right-hand side.

relational_expression : relational_expression 'is' pattern;  
It is a compile-time error if the relational_expression to the left of the “is” token does not designate a value or does not have a type. Every identifier of the pattern introduces a new local variable that is definitely assigned after the “is” operator is true (i.e. definitely assigned when true).
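
As a minimal sketch of this rule (not from the original post; Describe is a made-up method), the type pattern below introduces the variable i, which is definitely assigned only where the test has succeeded:

static void Describe(object o)
{
    if (o is int i)
    {
        // 'i' is definitely assigned here, because the pattern matched.
        Console.WriteLine($"An int with value {i}");
    }
    else
    {
        // 'i' is not definitely assigned here, so using it would be a compile-time error.
        Console.WriteLine("Not an int");
    }
}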


Pattern

Patterns are used in the is operator and in a switch_statement to express the shape of data against which incoming data is to be compared.



There are many areas where we can use patterns in C#. You can do pattern matching on any data type, even your own, whereas with plain if/else comparisons you are mostly limited to simple checks. Pattern matching can also extract values from your expression.



For example, suppose I have a handful of types: 
class Person(string Name);  
  
class Student(string Name, double Gpa) : Person(Name);  
  
class Teacher(string Name, string Subject) : Person(Name);  
  
// This sample uses the proposed record type feature (described later) to declare the types  
I want to perform some operations that are specific to each type.
static string PrintedForm(Person p)  
{  
    Student s;  
    Teacher t;  
    if ((s = p as Student) != null && s.Gpa > 3.5)  
    {  
        return $"Honor Student {s.Name} ({s.Gpa})";  
    }  
    else if (s != null)  
    {  
        return $"Student {s.Name} ({s.Gpa})";  
    }  
    else if ((t = p as Teacher) != null)  
    {  
        return $"Teacher {t.Name} of {t.Subject}";  
    }  
    else  
    {  
        return $"Person {p.Name}";  
    }  
}  
Below is the client application that consumes the PrintedForm function:
static void Main(string[] args)  
{  
    Person[] oa = {  
        new Student("Einstein", 4.0),  
        new Student("Elvis", 3.0),  
        new Student("Poindexter", 3.2),  
        new Teacher("Feynmann", "Physics"),  
        new Person("Anders"),  
    };  
    foreach (var o in oa)  
    {  
        Console.WriteLine(PrintedForm(o));  
    }  
    Console.ReadKey();  
}  
In the above sample, to hold the converted objects I need to create the temporary variables s and t. So if I have N types to check, I need to create N temporary variables with distinct names, which makes the code more verbose. Also, each temporary variable is only needed in a particular code block, yet it is scoped to the whole function. Note the need to declare the variables s and t ahead of time even though each is used in only one of the code blocks. 


As part of the pattern-matching feature, the “is” operator is repurposed to take a pattern on the right-hand side, and one kind of pattern is a variable declaration. That allows us to simplify the code like this:
static string PrintedForm(Person p)  
{  
    if (p is Student s && s.Gpa > 3.5) //!  
    {  
        return $"Honor Student {s.Name} ({s.Gpa})";  
    }  
    else if (p is Student s)  
    {  
        return $"Student {s.Name} ({s.Gpa})";  
    }  
    else if (p is Teacher t)  
    {  
        return $"Teacher {t.Name} of {t.Subject}";  
    }  
    else  
    {  
        return $"Person {p.Name}";  
    }  
}  
Now you can see that the temporary variables s and t are declared and scoped only to the places where they are needed. The switch statement is also repurposed so that the case branches can have patterns instead of just constants:
static string PrintedForm(Person p)  
{  
    switch (p) //!  
    {  
        case Student s when s.Gpa > 3.5:  
            return $"Honor Student {s.Name} ({s.Gpa})";  
        case Student s:  
            return $"Student {s.Name} ({s.Gpa})";  
        case Teacher t:  
            return $"Teacher {t.Name} of {t.Subject}";  
        default:  
            return $"Person {p.Name}";  
    }  
}  
You can see in the above example that the case statements use patterns. Please also note the new when keyword in the switch statement.


Record Types

Record types are a concept for creating a type that consists only of properties. With record types, the constructor declaration can be embedded in the class declaration. 



For example:


class Student(string Name, int Age);  
This simple declaration would automatically generate the equivalent of the following code:
class Student   
{  
    string _name;  
    int _age;  
  
    public Student(string Name, int Age)  
    {  
        this._name = Name;  
        this._age = Age;  
    }  
    public string Name { get { return this._name; } }  
    public int Age { get { return this._age; } }  
}   
  • Read-only properties, which make it an immutable type.

  • The class automatically gets equality implementations (GetHashCode, Equals, operator ==, operator !=, and so forth).

  • A default implementation of the ToString() method.
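
To make the effect of those generated members concrete, here is a hypothetical sketch using the Student record declared above (record types as proposed here did not ship in C# 7.0, so the exact behavior and output are illustrative only):

var a = new Student("Alice", 21);
var b = new Student("Alice", 21);

Console.WriteLine(a.Equals(b)); // True: the generated Equals compares member values
Console.WriteLine(a == b);      // True: operator == would be generated as well
Console.WriteLine(a);           // Some readable, member-wise output from the generated ToString()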
Non-Nullable reference types:


The non-nullable reference option lets you create a reference type that is guaranteed not to be null. NullReferenceExceptions are all too common in a project; often we developers forget to check a reference type for null before accessing its properties, thus paving the way for problems.


Either we forget to check for null, making our code vulnerable to runtime exceptions, or we do check for it, which makes our code more verbose.



Just as “?” identifies a nullable value type, “!” will identify a non-nullable reference type. The currently proposed syntax is as follows:
int a; // non-nullable value type  
int? b; // nullable value type  
string! c; // non-nullable reference type  
string d; // nullable reference type  
  
MyClass a; // nullable reference type  
MyClass! b; // non-nullable reference type  
  
a = null; // OK, this is nullable  
b = null; // Error, b is non-nullable  
b = a; // Error, a might be null, b can't be null  
  
WriteLine(b.ToString()); // OK, can't be null  
WriteLine(a.ToString()); // Warning! Could be null!  
  
if (a != null) { WriteLine(a.ToString()); } // OK, you checked  
WriteLine(a!.ToString()); // OK, if you say so  
  
It would be quite problematic to use the same syntax for generic types and collections. For example:
// The Dictionary itself is non-nullable, but string, List and MyClass aren't  
Dictionary<string, List<MyClass>>! myDict;   
  
// Proper way to declare all the types as non-nullable  
Dictionary<string!, List<MyClass!>!>! myDict;  
To specify that all the type arguments in a collection are non-nullable, a shortcut syntax has been proposed:
// Typing ! in front of the type arguments makes all types non-nullable  
Dictionary!<string, List<MyClass>> myDict;  
Immutable Types


An immutable object is an object whose state cannot be changed after its creation; once constructed, it cannot be modified in any way, externally or internally.



Immutable objects offer a few benefits: 
  • They are inherently thread-safe.
  • They are easier to parallelize.
  • They make code easier to use and reason about.
  • References to immutable objects can be cached, as they won’t change.
Currently it is already possible to create immutable classes: create a class whose properties have only getters and whose private fields are read-only or constant.
public class Point  
{  
    public Point(int x, int y)  
    {  
        X = x;  
        Y = y;  
    }  
    public int X { get; }  
    public int Y { get; }  
}  
The code in the above example is definitely an immutable class, but the intent of the class is not clearly stated as immutable: in the future, anyone can add setters to any of the properties, making it mutable. 


The proposed syntax for creating an immutable class forces the developer to strictly adhere to the rules and thus keeps the class immutable. Below is the proposed syntax:
public immutable class Point  
{  
    public Point(int x, int y)  
    {  
        X = x;  
        Y = y;  
    }  
  
    public int X { get; }  
    public int Y { get; }  
}