Challenge
With more than 300 million active users and total 2017 revenue of more than $55 billion, JD.com is China’s largest retailer, and its operations are the epitome of hyperscale. For example, there are more than a trillion images in JD.com’s product databases (with 100 million being added daily), and this enormous amount of data needs to be instantly accessible. In 2014, JD.com moved its applications to containers running on bare metal machines using OpenStack and Docker to "speed up the delivery of our computing resources and make the operations much simpler," says Haifeng Liu, JD.com’s Chief Architect. But by the end of 2015, with tens of thousands of nodes running in multiple data centers, "we encountered a lot of problems because our platform was not strong enough, and we suffered from bottlenecks and scalability issues," says Liu. "We needed infrastructure for the next five years of development, now."
Solution
JD.com turned to Kubernetes to accommodate its clusters. At the beginning of 2016, the company began to transition from OpenStack to Kubernetes, and today, JD.com runs the world’s largest Kubernetes cluster. "Kubernetes has provided a strong foundation on top of which we have customized the solution to suit our needs as China’s largest retailer."
Impact
"We have greater data center efficiency, better managed resources, and smarter deployment with the Kubernetes platform," says Liu. Deployment time went from several hours to tens of seconds. Efficiency, measured in IT costs, has improved by 20-30%. With the further optimizations the team is working on, Liu believes there is the potential to save hundreds of millions of dollars a year. But perhaps the best indication of success was the annual Singles Day shopping event, which ran on the Kubernetes platform for the first time in 2018. Over 11 days, transaction volume on JD.com was $23 billion, and "our e-commerce platforms did great," says Liu. "Infrastructure led the way to prep for 11.11. We took the approach of predicting volume, emulating the behavior of customers to prepare beforehand, and drilled for malfunctions. Because of Kubernetes’s scalability, we were able to handle an extremely high level of demand."