map_suite_geocoder_performance_guide

Differences

This shows you the differences between two versions of the page.

map_suite_geocoder_performance_guide [2019/03/21 01:55]
tgwikiupdate
map_suite_geocoder_performance_guide [2019/03/21 02:49]
tgwikiupdate
Line 3: Line 3:
  
 The purpose of this guide is to help you to know following things: The purpose of this guide is to help you to know following things:
-  * A simple ​introduction for best practice of Map Suite Geocoder.+  * An introduction for best practice of Map Suite Geocoder.
   * Using multi-thread to improve the performance of batch query.   * Using multi-thread to improve the performance of batch query.
   * The bottleneck of Map Suite Geocoder performance.   * The bottleneck of Map Suite Geocoder performance.
Line 42: Line 42:
 To test the ultimate performance of Geocoder, we use the 2nd call in this guide. To test the ultimate performance of Geocoder, we use the 2nd call in this guide.
 ===== Benchmark ===== ===== Benchmark =====
-In this section we will help you to use multi-thread to improve the geocoding when doing huge queries, then show you the benchmark ​test results that we did.+In this section we will help you to use multi-thread to improve the geocoding when doing huge queries, then show you the benchmark results that we did.
 ==== Using Multi-thread ==== ==== Using Multi-thread ====
 With the limitation of cache and file data read, Geocoder doesn'​t support inner multi-thread to do the batch query. But it's bad if customer wants to do a huge query with more than 100,000 input texts, below code snippet will help you to improve the query performance:​ With the limitation of cache and file data read, Geocoder doesn'​t support inner multi-thread to do the batch query. But it's bad if customer wants to do a huge query with more than 100,000 input texts, below code snippet will help you to improve the query performance:​
Line 112: Line 112:
 Task.WaitAll(tasks.ToArray());​ Task.WaitAll(tasks.ToArray());​
 </​code>​ </​code>​
-We preprocess the input texts to split them to eight chunks, then generate eight tasks to do the batch queries, each task maintains independent Geocoder to avoid multi-thread error. ​Below Figure 4 shows the elapsed time compare:+We preprocess the input texts to split them to eight chunks, then generate eight tasks to do the batch queries, each task maintains independent Geocoder to avoid multi-thread error. ​
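The chunk-and-task approach described in this hunk can be sketched as follows. This is a minimal illustration, not the guide's elided snippet: `ChunkedBatch`, `Split`, `Run`, and the `createMatcher` factory are hypothetical names, and the factory stands in for whatever code constructs a Geocoder and runs a match, so that each task owns its own instance.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical helper: not part of Map Suite Geocoder.
public static class ChunkedBatch
{
    // Splits inputs into at most chunkCount contiguous chunks of near-equal size.
    public static List<List<string>> Split(IReadOnlyList<string> inputs, int chunkCount)
    {
        if (chunkCount <= 0) throw new ArgumentOutOfRangeException(nameof(chunkCount));
        var chunks = new List<List<string>>();
        if (inputs.Count == 0) return chunks;
        int size = (inputs.Count + chunkCount - 1) / chunkCount; // ceiling division
        for (int start = 0; start < inputs.Count; start += size)
        {
            chunks.Add(inputs.Skip(start).Take(size).ToList());
        }
        return chunks;
    }

    // Runs one task per chunk. createMatcher is called once inside each task,
    // so no matcher (assumed to wrap an independent Geocoder) is shared across threads.
    public static List<TResult> Run<TResult>(
        IReadOnlyList<string> inputs,
        int chunkCount,
        Func<Func<string, TResult>> createMatcher)
    {
        List<Task<List<TResult>>> tasks = Split(inputs, chunkCount)
            .Select(chunk => Task.Run(() =>
            {
                Func<string, TResult> match = createMatcher();
                return chunk.Select(text => match(text)).ToList();
            }))
            .ToList();

        Task.WaitAll(tasks.ToArray());

        // Tasks are read back in chunk order, so results keep the input order.
        return tasks.SelectMany(task => task.Result).ToList();
    }
}
```

A call such as `ChunkedBatch.Run<string>(texts, Math.Min(8, Environment.ProcessorCount), () => text => text.ToUpperInvariant())` shows the shape; in practice the factory would build one Geocoder per task. Capping the chunk count at the processor count follows the guide's advice to keep the thread count at or below the CPU core count.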
  
-{{:​map_suite_geocoder_performance_guide_004.png}} 
-\\ 
-//Figure 4. Query Compare.// 
  
-It shows that we improved about three times speed after using multi-thread. 
 ==== Benchmark Reports ==== ==== Benchmark Reports ====
-To compare the performance between single thread and multi-thread,​ we used the following machine hardware device to do the benchmark +To compare the performance between single thread and multi-thread,​ we used the following ​(Figure 2) machine hardware device to do the benchmark:
-  * CPU: Intel® Core™ i7-4790 CPU @ 3.60GHz +
-  * Memory: 8.00 GB +
-  * Disk: Crucial CT500MX200SSD1 +
-  * System: Windows 10 64-bit +
-And the hardware usage when querying is: +
-  * CPU: 70% - 90% +
-  * Memory: 300 MB + +
-  * IO: 30 MB/Seconds+
  
-To dig the limitation of the Geocoder with multi-thread we also created different [[https://​aws.amazon.com/​ec2/​instance-types|Amazon® instances]] to do the same test in Figure 4. Below Figure 5 is the test results:+{{:​map_suite_geocoder_performance_guide_002.png}} 
 +\\ 
 +//Figure 2. Machine Hardware Device Information.//​ 
 + 
 +Below Figure 3 shows the benchmark result: 
 + 
 +{{:​map_suite_geocoder_performance_guide_003.png}} 
 +\\ 
 +//Figure 3. Single Thread & Multi-Thread Benchmark Result.// 
 + 
 +It shows that we improved about three times speed after using multi-thread. And the below Figure 4 shows the hardware usage when using multi-thread:​ 
 + 
 +{{:​map_suite_geocoder_performance_guide_004.png}} 
 +\\ 
 +//Figure 4. Hardware Usage Using Multi-Thread.//​ 
 + 
 +To dig the limitation of the Geocoder with multi-thread we also created different [[https://​aws.amazon.com/​ec2/​instance-types|Amazon® instances]] to do the same benchmark. Below Figure 5 is the test result:
  
 {{:​map_suite_geocoder_performance_guide_005.png}} {{:​map_suite_geocoder_performance_guide_005.png}}
 \\ \\
-//Figure 5. Performance Test Result on Amazon® Instance.//+//Figure 5. Amazon® Instance ​Benchmark Result.//
  
-Note that we set the thread count to eight because we always tend to make the thread count to equal with or less than the CPU core count, if you start too much threads, the query speed would decrease instead.+The match query speed increased significantly after using high performance hardware device. ​Note that we set the thread count to eight because we always tend to make the thread count to equal with or less than the CPU core count, if you start too much threads, the query speed would decrease instead.
 ===== Bottleneck of Performance ===== ===== Bottleneck of Performance =====
 In this section we will discuss what is the bottleneck of Map Suite Geocoder performance. In this section we will discuss what is the bottleneck of Map Suite Geocoder performance.
Line 144: Line 148:
 The another bottleneck is input/​output text normalization,​ customer wants to input various of texts to do the match query, Geocoder has to split and explain them at first, then compare the normalized text cluster with cache or file data to do the match. The matched results also need to be normalized as the Geocoder result. We always keep to update our normalization algorithm to make it more faster and accurate. The another bottleneck is input/​output text normalization,​ customer wants to input various of texts to do the match query, Geocoder has to split and explain them at first, then compare the normalized text cluster with cache or file data to do the match. The matched results also need to be normalized as the Geocoder result. We always keep to update our normalization algorithm to make it more faster and accurate.
 ==== Performance Test ==== ==== Performance Test ====
-Below Figure 2 is the test result that did 1,000 queries one by one, the execution time is 4.669 seconds:+Below Figure 6 is the analysis result that did **1,000** queries one by one, the execution time is **4.669** seconds:
  
-{{:map_suite_geocoder_performance_guide_002.png}}+{{:map_suite_geocoder_performance_guide_006.png}}
 \\ \\
-//Figure 2. Performance Test With 1,000 Queries.//+//Figure 6. Analysis Result With 1,000 Queries.//
  
-And the below Figure 3 is the test result that did 10,000 queries one by one, the execution time is 16.588 seconds:+And the below Figure 7 is the analysis result that did **10,000** queries one by one, the execution time is **16.588** seconds:
  
-{{:map_suite_geocoder_performance_guide_003.png}}+{{:map_suite_geocoder_performance_guide_007.png}}
 \\ \\
-//Figure 3. Performance Test With 10,000 Queries.//+//Figure 7. Analysis Test With 10,000 Queries.//
  
 It's obvious that the most time spent is on I/O (includes file or cache data read), then normalization. It's obvious that the most time spent is on I/O (includes file or cache data read), then normalization.
map_suite_geocoder_performance_guide.txt · Last modified: 2019/03/21 02:49 by tgwikiupdate