The Naytra Architecture

By Ravi Evani. Filed under Naytra.

The Naytra architecture: scalable for recommendation systems

The third area we need to examine as part of the data science behind Naytra is the technology architecture and infrastructure that support model generation. We will also look at a scalable mechanism for responding in real time to a potentially large number of requests from brands that want a relevance prediction for their marketing messages against their target consumers.

The entire Naytra infrastructure runs in the cloud, and computation is split into offline, nearline, and online layers. This concept is adopted from the Netflix architecture.

The Offline Layer

The offline layer holds all the data in a NoSQL data store, and much of the computation, such as model generation, runs in this layer. Offline computation has fewer limitations on the amount of data and the computational complexity of the algorithms, since it runs in batch mode with relaxed timing requirements. However, the resulting models can easily grow stale between updates because the most recent data is not incorporated. Since the models are built over a large amount of data, it can be beneficial to run the jobs in a distributed fashion, which makes them good candidates for running on Hadoop via either Hive or Pig jobs.
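To make the batch idea concrete, here is a minimal single-machine sketch of an offline job that turns a batch of interaction events into a model snapshot. Everything in it is an assumption for illustration: the event shape `(consumer_id, category, clicked)`, the function name `build_relevance_model`, and the smoothed click rate used as a stand-in for a relevance score. It is not Naytra's actual model. On Hadoop, the same aggregation would typically be expressed as a grouped count in a Hive or Pig job.

```python
from collections import defaultdict


def build_relevance_model(events, prior_clicks=1.0, prior_views=2.0):
    """Aggregate a batch of events into smoothed click rates.

    events: iterable of (consumer_id, category, clicked) tuples (hypothetical shape).
    Returns a dict mapping (consumer_id, category) -> score in (0, 1).
    """
    views = defaultdict(int)
    clicks = defaultdict(int)
    for consumer_id, category, clicked in events:
        key = (consumer_id, category)
        views[key] += 1
        clicks[key] += 1 if clicked else 0
    # The priors keep scores away from 0 and 1 when a key has few events.
    return {
        key: (clicks[key] + prior_clicks) / (views[key] + prior_views)
        for key in views
    }


# A tiny hypothetical batch; a real job would read from the NoSQL store.
events = [
    ("c1", "shoes", True),
    ("c1", "shoes", False),
    ("c1", "books", False),
    ("c2", "shoes", True),
]
model = build_relevance_model(events)
```

Because the job only sees the events in the batch it was given, anything that happens after the snapshot is built is invisible to it, which is exactly the staleness problem described above.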

Additionally, job execution involves taking huge batches of data and performing the same operations over and over very quickly, so these jobs scale much better with massive parallelization across graphics processing units (GPUs) than across multi-core CPUs. There are also a number of efforts underway to build new types of hardware for machine learning, such as Google’s Tensor Processing Units (TPUs), which are designed specifically for neural network computations.

The Nearline Layer

The nearline layer is responsible for updating models from recent information signals when it is not possible to regenerate the entire model in near real time. One of the key issues is combining and managing offline and online computation seamlessly.

Nearline computation is an intermediate compromise between the offline and online modes: we can perform online-like computations but do not need to serve their results in real time. This layer is therefore responsible for any statistically significant updates, triggered by recent user behavior, that could affect the model. A high-speed in-memory data store is used for this purpose, along with the NoSQL data store.
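One way to picture a "statistically significant update" is sketched below. It assumes the offline model is a dict of scores keyed by `(consumer_id, category)`, uses a plain Python dict as a stand-in for the in-memory data store, and applies a one-sample z-test on the recent click rate. The class name `NearlineUpdater`, the threshold, and the minimum event count are all hypothetical choices, not details from the Naytra design.

```python
import math


class NearlineUpdater:
    """Tracks recent events in memory and flags keys whose recent behavior
    deviates significantly from the offline model's score."""

    def __init__(self, offline_model, z_threshold=2.0, min_events=5):
        self.offline_model = offline_model
        self.z_threshold = z_threshold
        self.min_events = min_events
        self.recent = {}  # key -> [views, clicks]; stand-in for the in-memory store

    def observe(self, key, clicked):
        """Record one event and return True if the key should be refreshed."""
        counts = self.recent.setdefault(key, [0, 0])
        counts[0] += 1
        counts[1] += int(clicked)
        return self.needs_update(key)

    def needs_update(self, key):
        views, clicks = self.recent.get(key, (0, 0))
        p0 = self.offline_model.get(key)
        # Skip unknown keys, tiny samples, and degenerate offline scores.
        if p0 is None or views < self.min_events or not 0.0 < p0 < 1.0:
            return False
        # One-sample z-test of the recent click rate against the offline score.
        std_err = math.sqrt(p0 * (1.0 - p0) / views)
        z = (clicks / views - p0) / std_err
        return abs(z) >= self.z_threshold


updater = NearlineUpdater({("c1", "shoes"): 0.5})
# Five clicks in a row against an offline score of 0.5.
flags = [updater.observe(("c1", "shoes"), True) for _ in range(5)]
```

In this sketch the first four events are ignored because the sample is too small, and the fifth one trips the threshold. A flagged key could then be rescored on its own, without regenerating the whole model.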

The Online Layer

Finally, the online layer is what executes the model. This layer can respond better to recent events, but it has to respond to requests in real time, which can limit both the computational complexity of the algorithms that can be employed and the amount of data that can be processed. It is a typical web and application server stack that performs the computation and responds to requests through REST APIs.
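As a rough illustration of the online layer, the sketch below serves precomputed scores over a small JSON endpoint using only Python's standard library. The `/relevance` path, the request fields `consumer_id` and `category`, and the in-process `MODEL` dict are assumptions made for the example; the actual Naytra API is not described in this post.

```python
import json
from wsgiref.simple_server import make_server

# Hypothetical model snapshot produced offline; a real service would load it
# from the data store rather than hard-code it.
MODEL = {("c1", "shoes"): 0.5}
DEFAULT_SCORE = 0.0


def score(consumer_id, category):
    """Constant-time lookup, so the online path stays cheap."""
    return MODEL.get((consumer_id, category), DEFAULT_SCORE)


def app(environ, start_response):
    """WSGI app: POST /relevance with JSON {"consumer_id": ..., "category": ...}."""
    if environ.get("REQUEST_METHOD") != "POST" or environ.get("PATH_INFO") != "/relevance":
        start_response("404 Not Found", [("Content-Type", "application/json")])
        return [b'{"error": "not found"}']
    try:
        length = int(environ.get("CONTENT_LENGTH") or 0)
        payload = json.loads(environ["wsgi.input"].read(length))
        result = {"relevance": score(payload["consumer_id"], payload["category"])}
        status = "200 OK"
    except (ValueError, KeyError, TypeError):
        result = {"error": "expected JSON with consumer_id and category"}
        status = "400 Bad Request"
    body = json.dumps(result).encode("utf-8")
    start_response(status, [("Content-Type", "application/json"),
                            ("Content-Length", str(len(body)))])
    return [body]


def serve(host="127.0.0.1", port=8000):
    """Run a development server; call this explicitly to try the endpoint."""
    with make_server(host, port, app) as server:
        server.serve_forever()
```

The request handler does nothing heavier than a dictionary lookup, which reflects the constraint described above: whatever is expensive has to happen in the offline or nearline layers so that the online path can answer within a real-time budget.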

Conclusion

Envisioning the Naytra technology architecture requires us to think about how to use sophisticated machine learning algorithms that can grow to arbitrary complexity and deal with large amounts of data. We want the relevance prediction results to be fresh and responsive to new pieces of a consumer’s digital exhaust. In addition to this core stack, the architecture needs to leverage external machine learning APIs available in the cloud, such as the image recognition and speech APIs offered today by Google, Microsoft, IBM, and others.
