March 19, 2018 – The Chinese daily newspaper Reference News reported on its official website that the internationally renowned research firm Forrester had released its "Cloud Data Warehouse, Q1 2018" report. The report comprehensively evaluated the primary functions, regional performance, market segments, typical customers, and other features of big data service providers.
Forrester reports are highly influential within the cloud industry and are often regarded as guidebooks for the CIOs of major international companies. Based on its evaluation criteria, Forrester selected four companies: AWS, Alibaba Cloud, Google, and Microsoft. Alibaba Cloud was the only Chinese tech company selected.
Evaluation Criteria
Cloud-based big data services have been in high demand in recent years thanks to their security, elastic scalability, rapid deployment, and low cost. By contrast, locally deployed big data analytics solutions are gradually becoming obsolete. In its evaluation, Forrester required each supplier to meet the following criteria:
1) Sophisticated big data warehouse products
2) Independent big data warehouse solutions
3) Big data use cases
4) Publicly available products
5) A leading position in regional markets
6) Advanced technology
As the only selected Chinese product, MaxCompute received a detailed analysis in the Forrester report. In the following sections, we will be sharing the journey of the Alibaba Cloud big data processing service, MaxCompute.
Evolution of MaxCompute
In 2009, Alibaba reached the Greenplum ceiling. It was difficult to scale Greenplum beyond 100 hosts and 1,000 TB, and even that maximum capacity was far from enough to support a thriving business like Alibaba Cloud.
In September 2009, Alibaba Cloud launched R&D on its Apsara big data platform. The aim was to create a self-developed data warehouse, MaxCompute, for data volumes measured in exabytes. It was a slow but enlightening journey for the team at Alibaba Cloud. Only after eight years was the team able to deploy a global network of clusters, each with over 10,000 servers.
Global Reach
Last year, Forrester reported that, although cloud data warehouse (CDW) providers offered excellent cloud-based features, many of them exhibited deficiencies in areas such as global deployment, data security, integration, modeling, and governance.
However, MaxCompute has consistently improved its global presence, performance, security, end-to-end development experience, and ecosystem.
MaxCompute is currently deployed in 15 regions worldwide, including Hong Kong, Singapore, Japan, Dubai, US West, US East, Australia, Indonesia, and India. It connects millions of servers to form a supercomputer capable of providing computing power to major Internet markets around the globe in the form of online public services.
Increased Performance and Efficient Development
MaxCompute's exabyte-level processing capability makes it a global leader in the field. In October 2017, MaxCompute completed the world's first public cloud-based 100 TB BigBench big data benchmark test, achieving performance in excess of 7830 QPM.
The next-gen big data language NewSQL, which combines the advantages of declarative and imperative coding, breaks through the technical restrictions of the previous SQL language. This unified programming language supports offline, quasi-real-time, stream, graph, machine learning, and other computing modes, as well as unstructured data processing. This greatly lowers the technical barriers to big data development.
Maximum Security
MaxCompute introduces multi-tenant cloud security isolation technology that upends the security limitations of traditional big data platforms. This technology refines security boundaries to the user, process, and CPU core levels. MaxCompute authorizes and audits millions of tenants and the tens of billions of tasks they perform each day to ensure financial-grade data security.
Comprehensive Data Modeling, Governance, and Integration
In response to the demand for big data construction, management, and applications in various industries, MaxCompute provides an all-in-one big data capability toolkit covering smart data construction and management for the entire process from data access to data consumption. This toolkit includes DataWorks, MaxCompute Studio, and other tools that help customers construct fully-integrated, asset- and service-oriented, self-optimizing, closed-loop smart data systems with unified standards, capable of driving innovation.
MaxCompute vs. the World
At only $354.7/QPM, MaxCompute provides its customers with a plethora of features that are comparable to, if not better than, those of similar products from competitors.
At the 2015 Sort Benchmark competition, Apsara set four new GraySort and MinuteSort world records.
- FuxiSort sorted 100 TB of data in 377 seconds.
- It used a shared testing environment, dual gigabit NICs, and mechanical hard disks.
At the 2016 Sort Benchmark competition, Apsara set two new CloudSort world records.
- NADSort sorted 100 TB of data at a cost of $144 ($1.44/TB).
In 2017, MaxCompute adapted to the TPC benchmark and expanded its data scale to 100 TB. In a global first, MaxCompute completed the public cloud-based BigBench big data benchmark test. Its performance exceeded 8200 QPM, establishing MaxCompute as a computing leader not only in China but worldwide.
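The sort figures above can be cross-checked with simple arithmetic. The following is a minimal sketch in Python that uses only the numbers quoted in this article; the function names are illustrative and are not part of any Alibaba Cloud tool.

```python
# Figures quoted in the article.
DATA_TB = 100            # data volume sorted in both runs, in TB
FUXISORT_SECONDS = 377   # FuxiSort elapsed time (2015)
NADSORT_COST_USD = 144   # NADSort total cost (2016)


def throughput_tb_per_second(data_tb: float, seconds: float) -> float:
    """Average sort throughput in TB per second."""
    return data_tb / seconds


def cost_per_tb(total_cost_usd: float, data_tb: float) -> float:
    """Average cost in USD per TB sorted."""
    return total_cost_usd / data_tb


if __name__ == "__main__":
    rate = throughput_tb_per_second(DATA_TB, FUXISORT_SECONDS)
    unit_cost = cost_per_tb(NADSORT_COST_USD, DATA_TB)
    print(f"FuxiSort throughput: {rate:.3f} TB/s")
    print(f"NADSort cost: ${unit_cost:.2f}/TB")
```

The per-TB cost reproduces the $1.44/TB figure stated above, and the throughput works out to roughly a quarter of a terabyte per second.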
Compliance and Certifications
Data privacy and security are becoming increasingly important topics in modern society. To meet the needs of its customers, Alibaba Cloud MaxCompute has earned 10 industry and security certifications from respected third-party institutions.
Leadership in Regional Markets
As a leading cloud computing vendor, Alibaba Cloud serves 2.3 million customers in over 200 countries and regions around the world. The company holds a 47.6% share of the public cloud market in China, roughly equal to that of almost all its competitors combined.
MaxCompute is dedicated to providing massive data storage and large-scale computing to its customers in 15 regions around the world, including Hong Kong, Singapore, Japan, Dubai, Europe, US West, US East, Australia, Indonesia, and India. In doing so, it empowers customers throughout the world with Alibaba Cloud's exceptional computing capabilities.
ofo Use Case
Using MaxCompute, the bike-sharing company ofo has started to establish data models and perform clustering to optimize its operations. By studying historical transaction data and user flow information, the company calculates the number of bicycles that need to be deployed in various areas, understands where bikes are taken from, and makes suitable plans to recover bikes from low-traffic areas and increase deployment in high-traffic areas.
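To make the idea concrete, the following Python sketch allocates a fixed fleet across areas in proportion to historical pickups. It is a deliberately simple demand-proportional heuristic, not ofo's actual model (the article mentions clustering, which is not shown here). The function name, the area labels, and the input layout are all hypothetical.

```python
from collections import Counter


def plan_rebalancing(trips, current_bikes):
    """Suggest how many bikes to add to or recover from each area.

    trips: iterable of (origin_area, destination_area) pairs from history.
    current_bikes: dict mapping area -> bikes currently parked there.
    Returns a dict mapping area -> change (positive = deploy more,
    negative = recover). Rounding means the changes may not sum to zero.
    """
    pickups = Counter(origin for origin, _ in trips)
    total_pickups = sum(pickups.values())
    total_bikes = sum(current_bikes.values())

    plan = {}
    for area in sorted(set(current_bikes) | set(pickups)):
        # Share of historical demand that originated in this area.
        share = pickups[area] / total_pickups if total_pickups else 0.0
        target = round(total_bikes * share)
        plan[area] = target - current_bikes.get(area, 0)
    return plan


if __name__ == "__main__":
    history = [("A", "B"), ("A", "B"), ("A", "C"), ("B", "A")]
    parked = {"A": 10, "B": 10, "C": 20}
    print(plan_rebalancing(history, parked))
```

In this toy input, area A accounts for three of four pickups, so the sketch suggests moving bikes from the idle area C to A. A production system would also need to account for time of day, drop-off patterns, and transport cost.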
In July 2017, ofo upgraded MaxCompute from 1.0 to 2.0. The new version increased the efficiency of offline operations by over 50% and allowed the company to process 32 million daily transactions with great ease. Overall, the upgrade increased the company's operational efficiency by 76%. At the same time, MaxCompute significantly reduced ofo's big data platform O&M costs. Now, the company only needs one part-time O&M employee. Compared to a self-built physical cluster, MaxCompute offers much lower total costs, while greatly increasing the efficiency of application development.
Beijing Genomics Institute (BGI) Use Case
Genetic technology is gradually moving out of the laboratory and into daily life. However, the resulting explosive growth in data volumes far exceeds the capability of traditional computing. In this context, BGI opted for MaxCompute.
In the Million Genomes Project, traditional computing methods take 3–5 days to analyze population structures, whereas MaxCompute can complete the entire analysis within one hour, greatly accelerating data throughput and delivery. Structural analysis of the genetic data of one million people is too complex for traditional computing. Using MaxCompute, BGI achieved a technological breakthrough, computing the genetic distance between one individual and 100,000 others in a matter of hours while reducing the cost to under $1,000. BGI continues to explore and build on the advantages provided by MaxCompute.
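As an illustration of what a one-against-many genetic distance computation involves, here is a minimal Python sketch. It assumes genotypes are encoded as allele counts (0, 1, or 2) per variant site and uses a normalized absolute-difference distance. The article does not state which encoding or distance measure BGI uses, so both are assumptions, as are the function names and sample IDs.

```python
def genetic_distance(a, b):
    """Normalized allele-count distance between two genotype vectors.

    a, b: equal-length sequences of allele counts (0, 1, or 2) per site.
    Returns a value in [0, 1]; 0 means identical at every site.
    """
    if len(a) != len(b):
        raise ValueError("genotype vectors must have the same length")
    if not a:
        raise ValueError("genotype vectors must not be empty")
    # Each site can differ by at most 2, hence the 2 * len(a) normalizer.
    return sum(abs(x - y) for x, y in zip(a, b)) / (2 * len(a))


def distances_to_cohort(query, cohort):
    """Distance from one individual to every individual in a cohort.

    cohort: dict mapping sample ID -> genotype vector.
    """
    return {sample_id: genetic_distance(query, genotype)
            for sample_id, genotype in cohort.items()}


if __name__ == "__main__":
    query = [0, 1, 2, 0]
    cohort = {"s1": [0, 1, 2, 0], "s2": [2, 1, 0, 2]}
    print(distances_to_cohort(query, cohort))
```

Each comparison is independent of the others, which is why this kind of workload parallelizes well across a large cluster: the cohort can be partitioned and each partition scored separately.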
Conclusion
In short, Alibaba Cloud MaxCompute provides multi-tenant big data warehousing and hybrid cloud services based on a public cloud. It is rapidly globalizing its services with special attention to the finance, Internet, retail, and e-commerce fields.
MaxCompute is a sophisticated product backed by over nine years of development. Its capability, along with its advanced technology and comprehensive big data development solutions, has earned it a leading position in the CDW market.