Conversational Commerce using Microsoft Bot Framework

I just posted a conversational commerce bot, built as a prototype with the Microsoft Bot Framework, over on my GitHub. It uses LUIS for natural language processing and makes use of the Bot State Service. A detailed tutorial is coming soon. But if you can’t wait, head on over to my GitHub page, download the source code, and start playing with it.

First GitHub Project – MVC Web API using Google OAuth

I finally activated the GitHub account I signed up for years ago and published my first GitHub project. It is an MVC Web API project that uses Google OAuth for authentication. You can see it here.

Even though a lot of the MVC project templates supposedly come with external authentication fully configured, in practice I found that to be untrue (except for the single-page app template). I could only get this to work after pulling information from several different blogs. I started off with an empty Web API project and built only the pieces I needed as I went.

The other thing I tried was sending POST requests from Postman to Google for authentication, and that did not work for me at all. I did get an authentication token back from Google, but I kept getting a 401 when making a POST call to the authorized function. Possible reasons include a misconfiguration of Postman (even though I double-checked everything else, and an unauthorized GET was working fine), Google blocking the request because it wants a consensual mouse click authorizing app access, or something else. I gave up on Postman and made AJAX calls from a simple HTML page instead.

So, if you are looking to add external authentication into your .NET MVC Web API, maybe this project on GitHub would be of some use to you!

Show Current Git Branch in PowerShell

If you switch between branches a lot, or find yourself running git branch to see which branch you are on, you can easily modify your PowerShell profile to show the active branch right in the prompt. Here is how.

Step 1

Open your PowerShell profile for editing by entering “notepad $profile” in PowerShell. This opens the profile file, which lives under C:\Users\yourname\Documents\WindowsPowerShell

Step 2

Add the following function to the profile.

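function prompt {
    # Show the current directory in the window title.
    $host.ui.rawui.WindowTitle = $(Get-Location)

    $prompt_string = "PS " + $(Get-Location) + " ["

    # A .git folder is only present at the root of a repository,
    # so the branch is shown there and not in subfolders.
    if (Test-Path .git) {
        git branch | foreach {
            # The active branch is the line that starts with "*".
            if ($_ -match "^\*(.*)") {
                $prompt_string += $matches[1] + " ]> "
            }
        }
    }
    else {
        # Not at a repository root: fall back to a plain prompt.
        $prompt_string = "PS> "
    }

    Write-Host ($prompt_string) -NoNewline -ForegroundColor Yellow
    return " "
}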

Step 3

Save the profile and re-open PowerShell. Your prompt should now appear in yellow, with the branch name in square brackets:

PS C:\Projects\foo [ x.x.currentBranch ]>

ACID vs. BASE

There is a good article on the ACID vs. BASE attributes of a database that you can read here. To recap, ACID means:

Atomic: Everything in a transaction succeeds or the entire transaction is rolled back.

Consistent: A transaction cannot leave the database in an inconsistent state.

Isolated: Transactions cannot interfere with each other.

Durable: Completed transactions persist, even when servers restart etc.
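To make the "Atomic" point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The accounts table, the transfer function, and the names alice and bob are made up for illustration and do not come from the article. The transfer credits one account and then debits the other; if the debit would push a balance below zero, the whole transaction is rolled back, including the credit that had already succeeded.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one all-or-nothing transaction."""
    try:
        # `with conn` commits if the block succeeds and rolls back if it raises.
        with conn:
            # Credit first, so a failing debit has something to undo.
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected the debit; the credit was rolled back too.
        return False

def balances(conn):
    """Return {name: balance} for every account."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
with conn:
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )

# alice only has 100, so this transfer fails and neither balance changes.
ok = transfer(conn, "alice", "bob", 500)
print(ok, balances(conn))
```

The same constraint also illustrates "Consistent": the database refuses to end a transaction in a state that violates its own rules.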

And BASE refers to: 

Basic Availability

Soft-state

Eventual consistency

When and when not to use NoSQL

I came across this great article on Microsoft Azure Docs about NoSQL vs. SQL. In the development world, new technologies come down like rain every day. It is easy to get caught up in the latest and greatest trend and to swap your current favorite technology (a hammer) for a different technology (a different hammer), whatever the problem (a nail). It is important not to lose sight of what a new technology is really suited for, and when to use it or not use it.

The Microsoft article gives a great example of a social site where a user makes a post with different media that gets comments and likes from other users. In a purely relational design, you may end up creating separate tables for users, posts, media types, comments, etc., with one-to-many or many-to-many relationships going every which way. Something as simple as showing a post from a user may then require joins across several of these tables. Definitely not great for performance.

In comparison, in a document-based NoSQL database, you could save an entire document containing all the relevant information for a particular post, assigned to a user. Reading it back would be far more performant than the multi-table, multi-relationship joins an RDBMS solution would require.
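The difference can be sketched in a few lines of Python. This is only an illustration with plain dictionaries and lists standing in for tables and a document store; the field names and sample data are assumptions and are not taken from the Microsoft article.

```python
import json

# Relational style: one "table" per entity, linked by ids (hypothetical data).
users = {1: {"name": "alice"}, 2: {"name": "bob"}}
posts = {10: {"user_id": 1, "text": "Hello"}}
media = [{"post_id": 10, "type": "image", "url": "hello.png"}]
comments = [{"post_id": 10, "user_id": 2, "text": "Nice!"}]
likes = [{"post_id": 10, "user_id": 2}]

def load_post_relational(post_id):
    """Assemble a post by 'joining' five tables on their ids."""
    post = posts[post_id]
    return {
        "id": post_id,
        "author": users[post["user_id"]]["name"],
        "text": post["text"],
        "media": [
            {"type": m["type"], "url": m["url"]}
            for m in media if m["post_id"] == post_id
        ],
        "comments": [
            {"author": users[c["user_id"]]["name"], "text": c["text"]}
            for c in comments if c["post_id"] == post_id
        ],
        "likes": [
            users[l["user_id"]]["name"]
            for l in likes if l["post_id"] == post_id
        ],
    }

# Document style: the whole post is stored once as a JSON document.
document_store = {10: json.dumps(load_post_relational(10))}

def load_post_document(post_id):
    """A single key lookup returns everything needed to render the post."""
    return json.loads(document_store[post_id])

print(json.dumps(load_post_document(10), indent=2))
```

The trade-off is duplication: the author's name is now copied into every document that mentions it, so renaming a user means touching many documents instead of one row.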

There are things that relational databases are good at, for instance

  • Relational Queries
  • Defined and uniform table structure (all entries have same fields)
  • Well Defined Schema (though adding properties requires more work)
  • Structured Data
  • Vertical Scaling (More RAM, More Processing Power)

and there are things NoSQL storage is good at, for instance

  • Non-relational data (JSON, key-value pairs, etc.)
  • Ease of adding new properties
  • Unstructured data
  • Availability or Consistency (CAP Theorem)
  • Horizontal Scaling (Add Servers)

Clean Architecture of Microservices

I came across this article from Robert C. Martin (Uncle Bob) about microservices architecture. In it he argues that any architecture, whether microservices, traditional 3-tier, or anything in between, should be designed to be decoupled from the deployment strategy. The code should be unaware of whether it will eventually be deployed as microservices, DLLs, or anything else, and you shouldn’t bake the deployment strategy into your code. Interesting article for sure. You can read the complete article on his site here.

Google’s Spanner – Holy Grail of DBs or not?

Google opened its internal database, Spanner, to the public through its cloud offerings a couple of days ago, and it’s already being touted as something of a game changer. But is it really?

CAP Theorem

So basically there is this term CAP, often referred to as the CAP Theorem, which is an acronym for Consistency, Availability, and Partition Tolerance. Consistency means that every node in the cluster sees the same value for a piece of data at a given point in time. Availability means 100% uptime for both reads and writes. And partition tolerance means the database continues to function correctly when communication between servers is interrupted for some reason. Now, the CAP Theorem says you can have only two of the three and must sacrifice the third. Basically, you can have CA, CP, or AP, but not all three simultaneously.

It’s always been about A

Now, the person who initially coined the CAP Theorem was Eric Brewer of Google. He wrote an article yesterday, on Valentine’s Day (a true romantic), where he claims that it’s always been about A. That is, 100% availability has always been the most important of the trinity. You can live with outdated data, as long as some data, even if it’s not the most recent, is returned successfully.

How Google Beat Time

In a truly distributed database, where you have data centers strewn across the world, real-time or near-real-time consistency has always been an issue. The reason Spanner has been making waves over the last few days is the claim that Google has somehow been able to bend time. How have they done that? Basically by developing an advanced and sophisticated timekeeping mechanism. It uses GPS receivers and atomic clocks to keep its own track of time rather than depending on NTP. Google calls this TrueTime. A key factor in achieving this accuracy is that Spanner runs on Google’s private network. Google not only has a global footprint like no other company, but also runs and controls its own WAN.

RDBMS vs. NoSQL vs. Spanner

Typically, relational databases (RDBMS) like SQL Server, Oracle, MySQL, etc. scale up. That is, you can throw more RAM and processing power at them. The problem is that at some point you reach a limit. NoSQL databases get around this by scaling out, i.e. adding more servers or nodes. The problem then becomes synchronization and consistency. So NoSQL databases like Cassandra have specialized replication algorithms where nodes send each other updates to keep data fresh and synchronized. Spanner basically combines the relational qualities of an RDBMS with the distributed architecture of a NoSQL database. In Brewer’s own words:

Spanner is Google’s highly available, global SQL database. It manages replicated data at great scale, both in terms of size of data and volume of transactions. It assigns globally consistent real-time timestamps to every datum written to it, and clients can do globally consistent reads across the entire database without locking.

But is it really? Is it?

Strictly speaking, no, you cannot have 100% availability. What Spanner claims you can have, though, is near 100% availability with near consistency while operating over a wide area network. But that near may be just good enough. Google claims that Spanner offers five 9s availability, meaning fewer than 1 in 10^5 calls fail. That is good enough for a lot of businesses.
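To put five 9s in perspective, here is a small back-of-the-envelope sketch. Reading availability as a yearly downtime budget is a common convention and an assumption on my part; Google's figure is stated per call. The function names are made up for illustration.

```python
def allowed_downtime_minutes(nines, period_days=365):
    """Downtime budget in minutes for an availability with `nines` nines."""
    unavailability = 10 ** -nines  # five nines -> 0.00001
    return period_days * 24 * 60 * unavailability

def max_failed_calls(nines, total_calls):
    """Upper bound on failed calls out of `total_calls` at that availability."""
    return total_calls * 10 ** -nines

# Compare three, four, and five nines over a year.
for n in (3, 4, 5):
    print(n, "nines:", round(allowed_downtime_minutes(n), 2), "minutes per year")
```

At five 9s the budget works out to roughly 5.26 minutes of downtime per year, or at most about 10 failed calls out of every million.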

Is Spanner the DB Holy Grail?

I think that remains to be seen. The real question is whether, now that near-CAP is possible, companies really need it. If you are a multinational running global operations, are you going to be OK with other NoSQL choices like MongoDB and Cassandra, or even with locally scaled-up RDBMSs split by region and business unit? Do businesses really need all three tenets of CAP, or is it just a cool bit of technology?

Further Readings
Inside Cloud Spanner and the CAP Theorem
Why Google’s Spanner Database Won’t Do As Well As Its Clone
Google Launches Cloud Spanner — A NewSQL Database For Enterprises
CAP Confusion: Problems with ‘partition tolerance’