Stuff Goes Bad: Erlang in Anger

Profiling and Reduction Counts

To pin issues to specific pieces of Erlang code, as mentioned earlier, there are two main approaches. One is to do the old standard profiling routine, likely using one of the following applications: [2]
 • eprof, [3] the oldest Erlang profiler around. It will give general percentage values and will mostly report in terms of time taken (a minimal session is sketched right after this list).
 • fprof, [4] a more powerful replacement for eprof. It will support full concurrency and generate in-depth reports. In fact, the reports are so deep that they are usually considered opaque and hard to read.
 • eflame, [5] the newest kid on the block. It generates flame graphs to show deep call sequences and hot-spots in usage on a given piece of code. It lets you quickly find issues with a single look at the final result.
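 As a quick illustration of the profiling route, a minimal eprof session could look like this; the fun passed to eprof:profile/1 is only a placeholder for whatever workload you actually want to measure:

--------------------------------------------------
1> eprof:start().
2> eprof:profile(fun() -> lists:seq(1, 100000) end). %% placeholder workload
3> eprof:analyze(total). %% prints time spent per function, as percentages
4> eprof:stop().
--------------------------------------------------

 Keep footnote 2 in mind, though: these tools rely on tracing and shouldn't be run in production.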
 It will be left to the reader to thoroughly read each of these applications' documentation. The other approach is to run recon:proc_window/3 as introduced in Subsection 5.2.1:

--------------------------------------------------
1> recon:proc_window(reductions, 3, 500).
[{<0.46.0>,51728,
  [{current_function,{queue,in,2}},
   {initial_call,{erlang,apply,2}}]},
 {<0.49.0>,5728,
  [{current_function,{dict,new,0}},
   {initial_call,{erlang,apply,2}}]},
 {<0.43.0>,650,
  [{current_function,{timer,sleep,1}},
   {initial_call,{erlang,apply,2}}]}]
--------------------------------------------------


 The reduction count has a direct link to function calls in Erlang, and a high count is usually synonymous with heavy CPU usage.
 What's interesting with this function is to try it while a system is already rather busy, [6] with a relatively short interval. Repeat it many times, and you should hopefully see a pattern emerge where the same processes (or the same kind of processes) tend to always come up on top.
 Using the code locations [7] and the current functions being run, you should be able to identify what kind of code hogs all your schedulers.
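 For example, a rough sampling loop followed by a closer look at a suspicious process could look like this; the pid <0.46.0> comes from the sample output above, and the list comprehension is just a quick way to take several windows in a row:

--------------------------------------------------
1> [recon:proc_window(reductions, 3, 500) || _ <- lists:seq(1, 5)]. %% five short windows in a row; look for pids that keep showing up
2> recon:info("<0.46.0>", location). %% initial call and current stacktrace of a recurring offender
3> process_info(pid(0,46,0), current_stacktrace).
--------------------------------------------------

 Either of the last two calls gives enough of a stacktrace to map the busy process back to a specific spot in the code.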

[2] All of these profilers work using Erlang tracing functionality with almost no restraint. They will have an impact on the run-time performance of the application, and shouldn't be used in production.
[3] http://www.erlang.org/doc/man/eprof.html
[4] http://www.erlang.org/doc/man/fprof.html
[5] https://github.com/proger/eflame
[6] See Subsection 5.1.2
[7] Call recon:info(PidTerm, location) or process_info(Pid, current_stacktrace) to get this information.
