Platforms

27. Arm in the cloud
Trial

Arm compute instances in the cloud have become increasingly popular over the past few years due to their cost and energy efficiency compared with traditional x86-based instances. Many cloud providers now offer Arm-based instances, including AWS, Azure and GCP. The cost savings of running Arm in the cloud can be particularly significant for businesses that run large workloads or need to scale. Based on our experience, we recommend Arm compute instances for all workloads unless there are architecture-specific dependencies. Tooling that supports multiple architectures, such as multi-arch Docker images, also simplifies build and deploy workflows.

28. Ax
Trial

Faced with the challenge of exploring large configuration spaces, where it may take a significant amount of time to evaluate a given configuration, teams can turn to adaptive experimentation, a machine-guided, iterative process for finding optimal solutions in a resource-efficient manner. Ax is a platform for managing and automating adaptive experiments, including machine learning experiments, A/B tests and simulations. Currently, it supports two optimization strategies: Bayesian optimization using BoTorch, which is built on top of PyTorch, and contextual bandits. Facebook, when releasing Ax and BoTorch, described use cases such as increasing the efficiency of back-end infrastructure, tuning ranking models and optimizing hyperparameter search for a machine learning platform. We’ve had good experiences using Ax for a variety of use cases, and while tools for hyperparameter tuning exist, we’re unaware of a platform that provides functionality in a scope similar to Ax.

29. DuckDB
Trial

DuckDB is an embedded, columnar database for data science and analytical workloads. Data analysts usually load data locally in tools like pandas or data.table to quickly analyze patterns and form hypotheses before scaling the solution on the server.
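One query shape that often comes up in this kind of local, exploratory analysis is a range join, where each row is matched to the interval that contains it. The following is a minimal sketch only: it uses Python's built-in sqlite3 module purely as a dependency-free stand-in for an embedded SQL engine (it is not DuckDB and is not columnar), and the table names, column names and sample data are hypothetical.

```python
import sqlite3


def label_events_with_windows(events, windows):
    """Range join: pair each event with the window whose [start, end) holds its timestamp."""
    con = sqlite3.connect(":memory:")  # in-process database, no server required
    con.execute("CREATE TABLE events (event_id INTEGER, ts INTEGER)")
    con.execute("CREATE TABLE windows (label TEXT, start_ts INTEGER, end_ts INTEGER)")
    con.executemany("INSERT INTO events VALUES (?, ?)", events)
    con.executemany("INSERT INTO windows VALUES (?, ?, ?)", windows)
    rows = con.execute(
        """
        SELECT e.event_id, w.label
        FROM events AS e
        JOIN windows AS w
          ON e.ts >= w.start_ts AND e.ts < w.end_ts  -- inequality (range) condition
        ORDER BY e.event_id
        """
    ).fetchall()
    con.close()
    return rows


# Hypothetical sample data: (event_id, timestamp) and (label, start, end).
events = [(1, 5), (2, 15), (3, 25), (4, 40)]
windows = [("morning", 0, 10), ("midday", 10, 20), ("evening", 20, 30)]

labelled = label_events_with_windows(events, windows)
print(labelled)
```

Because this is an inner join, an event whose timestamp falls outside every window (event 4 here) does not appear in the result.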
However, we’re now using DuckDB for such use cases, because it unlocks the potential to do larger-than-memory analysis. DuckDB supports range joins, vectorized execution and multiversion concurrency control (MVCC) for large transactions, and our teams are quite happy with it.

30. Feature Store
Trial

Any software system needs to properly represent the domain in which it is employed and should always be informed by key aims and goals. Machine learning (ML) projects are no different. Feature engineering is a crucial aspect of engineering and designing ML software systems. Feature Store is a related architectural concept that facilitates the identification, discovery and monitoring of the features pertinent to the given domain or business problem. Implementing this concept involves a combination of architectural design, data engineering and infrastructure management to create a scalable, efficient and reliable ML system. From a tooling perspective, you can find open-source and fully managed platforms, but they’re only one part of this concept. In the end-to-end design of ML systems, implementing a feature store enables the following capabilities: the ability to (1) define the right

© Thoughtworks, Inc. All Rights Reserved.
Immersive Experience — Vol 28 | Thoughtworks Technology Radar