Moshe Kaplan of RockeTier shows the life cycle of an affiliate marketing
system that starts off as a cub handling one million events per day and ends up
a lion handling 200 million to even one billion events per day. The resulting
system uses ten commodity servers at a cost of $35,000.
Mr. Kaplan's paper is especially interesting because it documents a system
architecture evolution we may see a lot more of in the future: database
centric --> cache centric --> memory grid.
As scaling and performance requirements for complicated operations increase,
leaving the entire system in memory starts to make a great deal of sense. Why
use cache at all? Why shouldn't your system be all in memory from the start?
General Approach to Evolving the System to Scale
Analyze the system architecture and the main business processes. Detect the
main hardware bottlenecks and the related business process causing them. Focus
efforts on points of greatest return.
Rate the bottlenecks by importance and provide immediate, practical
recommendations to improve performance.
Implement the recommendations to provide immediate relief. This reduces risk
by avoiding both a full rewrite and spending a fortune on more resources.
Plan a road map toward next generation solutions.
Scale up and scale out when redesign is necessary.
One Million Event Per Day System
The events are common advertising system operations such as ad impressions,
clicks, and sales.
Typical two tier system. Impressions and banner sales are written directly
to the database.
The immediate goal was to process 2.5 million events per day so something
needed to be done.
2.5 Million Event Per Day System
PerfMon was used to check web server and database performance counters. CPU
usage was at 100% at peak load.
Immediate fixes included: tuning SQL queries, implementing stored
procedures, using a PHP
compiler, removing include files and fixing other programming errors.
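As a rough illustration of the kind of SQL tuning involved, the sketch below uses Python's sqlite3 as a stand-in database (the paper shows no code or schema, so the table and events are invented) and replaces per-event inserts with one batched, parameterized statement:

```python
import sqlite3

# Hypothetical schema; the paper does not give the real one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE impressions (banner_id INTEGER, ts INTEGER)")

events = [(1, 1000), (2, 1001), (1, 1002)]

# Before: one round trip per event (the pattern being tuned away).
# for banner_id, ts in events:
#     conn.execute("INSERT INTO impressions VALUES (?, ?)", (banner_id, ts))

# After: a single batched, parameterized statement -- fewer round trips
# and a reusable query plan, similar in spirit to moving logic into
# stored procedures.
conn.executemany("INSERT INTO impressions VALUES (?, ?)", events)
conn.commit()

count = conn.execute("SELECT COUNT(*) FROM impressions").fetchone()[0]
print(count)  # 3
```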
The changes successfully doubled the performance of the system within 3
months. The next goal was to handle 20 million events per day.
20 Million Event Per Day System
To make this scaling leap a rethinking of how the system worked was in
order.
The main load of the system was validating inputs in order to prevent
forgery.
A cache was maintained in the application servers to cut unnecessary
database access. The result was a 50% reduction in CPU utilization.
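A minimal sketch of such an application-server cache, assuming a TTL-based design (the paper names no cache product, and all names here are hypothetical): repeated validation lookups for the same banner are served from memory instead of the database.

```python
import time

class BannerCache:
    """Per-app-server cache for banner-validation lookups (hypothetical)."""

    def __init__(self, loader, ttl_seconds=60):
        self._loader = loader     # falls through to the database on a miss
        self._ttl = ttl_seconds
        self._store = {}          # banner_id -> (value, expires_at)

    def get(self, banner_id):
        now = time.monotonic()
        entry = self._store.get(banner_id)
        if entry is not None and entry[1] > now:
            return entry[0]       # cache hit: no database access
        value = self._loader(banner_id)
        self._store[banner_id] = (value, now + self._ttl)
        return value

db_calls = 0
def load_from_db(banner_id):      # stand-in for the real DB lookup
    global db_calls
    db_calls += 1
    return {"banner_id": banner_id, "valid": True}

cache = BannerCache(load_from_db)
for _ in range(1000):
    cache.get(42)
print(db_calls)  # 1 -- the other 999 lookups never touched the database
```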
An in-memory database was used to accumulate transactions over time
(impression counting, clicks, sales recording).
A periodic process was used to write transactions from the in-memory
database to the database server.
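The accumulate-then-flush pattern described above can be sketched as follows (a simplification under assumed names; the paper does not show code): events increment in-memory counters, and a periodic flush writes one batch instead of one row per event.

```python
from collections import Counter

class EventAccumulator:
    """Sketch of in-memory accumulation with periodic batched persistence."""

    def __init__(self):
        self.counts = Counter()   # (banner_id, event_type) -> count

    def record(self, banner_id, event_type):
        # O(1) in-memory increment instead of a database write per event.
        self.counts[(banner_id, event_type)] += 1

    def flush(self, write_batch):
        # Called periodically; one batched write replaces thousands of
        # per-event writes to the database server.
        batch = [(b, t, n) for (b, t), n in self.counts.items()]
        write_batch(batch)
        self.counts.clear()

persisted = []
acc = EventAccumulator()
for _ in range(5000):
    acc.record(7, "impression")
acc.record(7, "click")
acc.flush(persisted.extend)
print(sorted(persisted))  # [(7, 'click', 1), (7, 'impression', 5000)]
```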
This architecture could handle 20 million events per day using existing
hardware.
Business projections required a system that could handle 200 million events
per day.
200 Million Event Per Day System
The next architectural evolution was to a scale out grid product. It's not
mentioned in the paper but I think GigaSpaces
was used.
A Layer 7 load balancer is used to route requests to sharded application
servers. Each app server supports a different set of banners.
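The routing decision the Layer 7 balancer makes here can be sketched in a few lines. This assumes simple modulo assignment of banners to servers (the server names and shard count are invented; the real product likely does something more sophisticated):

```python
# Hypothetical shard map: each app server owns a distinct set of banners,
# so its cache and in-memory state stay authoritative for that set.
SERVERS = ["app-0", "app-1", "app-2", "app-3"]

def route(banner_id):
    # The L7 balancer inspects the request payload (here, the banner id)
    # and pins each banner to one app server.
    return SERVERS[banner_id % len(SERVERS)]

print(route(42))  # app-2
assert route(42) == route(42)  # the same banner always lands on one server
```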
Data is still stored in the database as the data is used for statistics,
reports, billing, fraud detection and so on.
Latency was slashed because logic was separated out of the HTTP
request/response loop into a separate process, and database persistence was
moved offline.
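One way to picture pulling work out of the request/response loop is a queue handoff: the handler enqueues the event and answers immediately, while a background worker does the processing and persistence. This is only a sketch of the idea; the paper credits a grid product, not hand-rolled queues.

```python
import queue
import threading

events = queue.Queue()
persisted = []

def worker():
    # Background worker: drains the queue and persists events offline,
    # outside the HTTP request/response loop.
    while True:
        event = events.get()
        if event is None:          # shutdown sentinel
            break
        persisted.append(event)    # stand-in for the offline DB write
        events.task_done()

t = threading.Thread(target=worker)
t.start()

def handle_request(event):
    events.put(event)              # O(1); the client gets a response now
    return "202 Accepted"

status = handle_request({"banner_id": 7, "type": "click"})
events.put(None)
t.join()
print(status, len(persisted))  # 202 Accepted 1
```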
At this point the architecture supports near-linear scaling and it's
projected that it can easily scale to a billion events per day.