The trip to China and Hong Kong was quick, but a lot was accomplished. The trip to South Korea was canceled because the meetings were not finalized. This worked out well in any case: I stayed an extra day in Beijing with a friend, got a local perspective on the food, and explored the local area. It was a good trip to China.
I presented some ideas on large DC designs, primarily around linking large IDCs together. MPLS VPNs (L3VPNs and L2VPNs) are usually the most straightforward option. You get the things you want on costly transit links, such as QoS, Traffic Engineering, and Load Balancing (ECMP, etc.), and you can also handle overlapping address spaces if you want to use the same addresses on machines in the production and staging areas of the provider network. Now, what if there is only Internet connectivity between the DCs, whether as the primary or the backup path? Is there still a way to deploy MPLS to link the DCs? Yes, it is quite feasible to run MPLS over GRE, and if you really need encryption, you can even run MPLS over GRE over IPsec. The overhead is not nice, but it works.
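To put a rough number on that overhead, here is a minimal Python sketch that adds up the headers. The function names are mine, and the IPsec figures are assumptions rather than anything from the talk: ESP in transport mode with AES-CBC (16-byte IV and block size) and HMAC-SHA1-96 (12-byte ICV), a basic 4-byte GRE header with no optional fields, IPv4 without options, and a two-label stack as you would typically see for an L3VPN. Other cipher suites or tunnel mode will give different numbers.

```python
# Header sizes in bytes. The ESP values are assumptions (see lead-in).
IPV4_HEADER = 20   # outer IPv4 header, no options
GRE_HEADER = 4     # basic GRE header, no key/sequence/checksum
MPLS_LABEL = 4     # one MPLS label stack entry
ESP_HEADER = 8     # SPI + sequence number
ESP_IV = 16        # assumed AES-CBC initialization vector
ESP_ICV = 12       # assumed HMAC-SHA1-96 integrity check value
ESP_BLOCK = 16     # assumed AES-CBC block size
ESP_TRAILER = 2    # pad length + next header


def total_size(payload_len, labels=2, encrypt=False):
    """On-the-wire IP packet size for a customer packet of payload_len bytes."""
    tunneled = payload_len + labels * MPLS_LABEL + GRE_HEADER
    if not encrypt:
        return IPV4_HEADER + tunneled
    # ESP transport mode: pad so that data + trailer fills whole cipher blocks.
    padding = -(tunneled + ESP_TRAILER) % ESP_BLOCK
    return (IPV4_HEADER + ESP_HEADER + ESP_IV
            + tunneled + padding + ESP_TRAILER + ESP_ICV)


def max_payload(mtu=1500, labels=2, encrypt=False):
    """Largest customer packet that still fits in the MTU without fragmenting."""
    for payload_len in range(mtu, 0, -1):
        if total_size(payload_len, labels, encrypt) <= mtu:
            return payload_len
    return 0


if __name__ == "__main__":
    print("MPLS over GRE:", max_payload(encrypt=False))
    print("MPLS over GRE over IPsec:", max_payload(encrypt=True))
```

Under these assumptions, a 1500-byte MTU leaves 1468 bytes for the customer packet over plain MPLS over GRE and 1426 bytes once ESP is added, so the MTU or MSS on the tunnel usually has to be lowered to avoid fragmentation.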
It is worth pointing out that end-to-end QoS and Traffic Engineering are not really feasible when MPLS runs over GRE tunnels, since the traffic crosses a pure IP network that operates on a best-effort basis.
We also discussed the use of Hadoop as a means of performing distributed computing on a large scale. Many of the big players use Hadoop, including Baidu, AWS, Alibaba, and AOL. The ideas behind Hadoop are quite impressive. For example, they wrote a filesystem (HDFS) that is fully distributed across hundreds or even thousands of nodes and uses the stock-standard disks inside each machine, because the IOPS are much higher than with a SAN. The idea is that it is easier to move the computation than it is to move the data. I couldn’t agree more.
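The "move the computation, not the data" idea can be illustrated with a toy task scheduler. The Python sketch below is not Hadoop's actual scheduler. It is a simplified greedy version, and the function name, block IDs, and node names are all made up for illustration. Given which nodes hold a replica of each block, it tries to run each block's task on a node that already stores the data and only falls back to a remote node when no local slot is free.

```python
def assign_tasks(block_replicas, slots_per_node):
    """Map each block to (node, is_local), preferring nodes that hold a replica.

    block_replicas: dict of block id -> list of nodes storing that block
    slots_per_node: dict of node -> number of free task slots
    """
    free = dict(slots_per_node)  # copy so the caller's dict is not modified
    assignment = {}
    # Blocks with the fewest replicas have the fewest local options, so go first.
    for block in sorted(block_replicas, key=lambda b: len(block_replicas[b])):
        local = [n for n in block_replicas[block] if free.get(n, 0) > 0]
        if local:
            node, is_local = local[0], True
        else:
            # No free slot next to the data: the block must cross the network.
            candidates = sorted(n for n, s in free.items() if s > 0)
            if not candidates:
                assignment[block] = (None, False)  # cluster is full
                continue
            node, is_local = candidates[0], False
        free[node] -= 1
        assignment[block] = (node, is_local)
    return assignment


if __name__ == "__main__":
    replicas = {
        "blk_1": ["node-a", "node-b"],
        "blk_2": ["node-a", "node-c"],
        "blk_3": ["node-a"],
    }
    slots = {"node-a": 1, "node-b": 1, "node-c": 1}
    for block, (node, is_local) in sorted(assign_tasks(replicas, slots).items()):
        print(block, node, "local" if is_local else "remote")
```

In this example every task lands on a node that already holds its block, so only the small task description travels over the network and the block itself never leaves the local disk.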