
pmacct [IP traffic accounting : BGP : BMP : IGP : Streaming Telemetry]
pmacct is Copyright (C) 2003-2017 by Paolo Lucente
Q1: What is the pmacct project homepage?
A: The pmacct homepage is http://www.pmacct.net/ . pmacct is also present on GitHub
at the URL: https://github.com/pmacct/pmacct .
Q2: 'pmacct', 'pmacctd', 'nfacctd', 'sfacctd', 'uacctd', 'pmtelemetryd',
'pmbgpd' and 'pmbmpd' -- but what do they mean?
A: 'pmacct' is the name of the project; 'pmacctd' is the name of the
libpcap-based IPv4/IPv6 accounting daemon; 'nfacctd' is the name of the NetFlow
(versions supported: NetFlow v1 to v9) and IPFIX accounting daemon; 'sfacctd' is
the name of the sFlow v2/v4/v5 accounting daemon; 'uacctd' is the name of the
Linux Netlink NFLOG-based accounting daemon (historically, it used ULOG,
hence its name); 'pmtelemetryd' is the name of the Streaming Telemetry collector
daemon, where, quoting the Cisco IOS-XR Telemetry Configuration Guide at the time
of this writing, "Streaming telemetry [ .. ] data can be used for analysis and
troubleshooting purposes to maintain the health of the network. This is achieved
by leveraging the capabilities of machine-to-machine communication. [ .. ]";
'pmbgpd' is the name of the pmacct BGP collector daemon; 'pmbmpd' is the name of
the pmacct BMP collector daemon.
Q3: Does pmacct stand for Promiscuous mode IP Accounting package?
A: Not anymore, though it originally did: pmacct was born as a libpcap-based traffic
collection project only. Over time it evolved to include NetFlow first, sFlow
shortly afterwards and NFLOG more recently, the latest additions being in the areas
of BGP, BMP and Streaming Telemetry. However, the unpronounceable name 'pmacct'
remained as a distinctive signature of the project.
Q4: What are pmacct main features?
A: pmacct can collect, replicate and export network information. On the data plane
(ie. IPv4/IPv6 traffic) it can cache in memory tables, store persistently to
RDBMS (MySQL, PostgreSQL, SQLite 3.x), noSQL databases (key-value: BerkeleyDB
5.x via the SQLite API, or document-oriented: MongoDB) and flat files (CSV,
formatted, JSON, Apache Avro output), and publish to AMQP and Kafka brokers (ie.
to insert into ElasticSearch, InfluxDB or Cassandra). It can also export data
speaking sFlow v5, NetFlow v1/v5/v9 and IPFIX. pmacct is able to perform data
aggregation, offering a rich set of primitives to choose from; it can also filter,
sample, renormalize, tag and classify at L7. On the control and infrastructure
planes it can collect BGP, BMP, IGP and Streaming Telemetry data and publish it to
AMQP and Kafka brokers, both standalone and as correlation/enrichment of data
plane information.
Q5: Do pmacct IPv4/IPv6 traffic accounting daemons log to flat files?
A: Yes. But while in other tools flat files are typically used to log every micro-flow
(or whatever aggregation the NetFlow agents have been configured to export) and
work in a two-stage fashion, ie. a) write down to persistent storage then b)
consolidate to build the desired view, since inception pmacct has aimed at a
single-stage approach instead, ie. offer data reduction techniques and correlation
tools to process network traffic data on the fly, so as to offer immediate view(s)
of the traffic. pmacct writes to files in text format (JSON, Avro, CSV or formatted
via the 'print' plugin, and JSON or Avro via the Kafka and AMQP plugins; see the
QUICKSTART doc for further information) so as to maximize integration with 3rd
party tools while keeping the effort of customization low.
Q6: What are the options to scale a pmacct deployment to match the input data rate?
A: There are two dimensions to it: 1) scale within the same instance of pmacct: make
use of the data reduction techniques part of pmacct, ie. spatial and temporal
aggregation, filtering, sampling and tagging. As these features are fully
configurable, going from full micro-flow visibility to - say - a node-to-node IP
network traffic matrix, data granularity/resolution can be traded off for
scalability/resource consumption; 2) divide-and-conquer input data over a set of
pmacct instances by either balancing or mapping data onto collectors. See the next
point, Q7, for libpcap; the 'tee' plugin can be used for this purpose for NetFlow,
IPFIX and sFlow.
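As a sketch of option 2), a 'tee' daemon instance can replicate incoming
NetFlow/IPFIX onto downstream collectors. The snippet below is illustrative only:
directive names follow the QUICKSTART doc, while the file path, addresses and the
balancing algorithm shown are assumptions to be checked against your pmacct version:

...
nfacctd_port: 2100
plugins: tee[rr]
tee_receivers[rr]: /usr/local/pmacct/etc/tee_receivers.lst
...

Where 'tee_receivers.lst' defines the receiver pool(s), one per line, ie.:

id=1	ip=192.168.0.10:2100,192.168.0.11:2100	balance-alg=rr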
Q7: I see my libpcap-based daemon (pmacctd) taking many CPU cycles; is there a way to
reduce the load?
A: CPU cycles are proportional to the amount of traffic (packets, flows, samples) that
the daemon receives; in the case of pmacctd it's possible to reduce the CPU share by
avoiding unnecessary copies of data, and also optimizing and buffering the necessary
ones. Kernel-to-userspace copies are critical and hence the first to be optimized;
for this purpose you may look at the following solutions:

mmap(): the Linux kernel has support for mmap() since 2.4. The kernel needs to be
2.6.34+ or compiled with the option CONFIG_PACKET_MMAP. You need at least 2.6.27
to get 64-bit compatibility. Starting from 3.10, you get a 20% increase in
performance and packet capture rate. You also need a matching libpcap library:
mmap() support was added in 1.0.0; to take advantage of the performance boost from
Linux 3.10, you need at least libpcap 1.5.0.

PF_RING, http://www.ntop.org/PF_RING.html : it's a type of network socket that
improves the packet capture speed; it's available for Linux kernels 2.[46].x; it's
kernel based and has libpcap support for seamless integration with existing
applications.

Device polling: it's available since the FreeBSD 4.5REL kernel and needs just a
kernel recompilation (with "options DEVICE_POLLING") and a polling-aware NIC. The
Linux kernel 2.6.x also supports device polling.

Internal buffering can also help and, contrary to the previous techniques, applies
to all daemons. Buffering is enabled with the plugin_buffer_size directive; buffers
can then be queued and distributed with a choice of a home-grown circular queue
implementation (plugin_pipe_size) or a ZeroMQ queue (plugin_pipe_zmq). Check both
CONFIG-KEYS and QUICKSTART for more information.
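A minimal sketch of the buffering directives just mentioned; the values below are
illustrative only and should be sized to your traffic rates (see CONFIG-KEYS for
the exact semantics and constraints of each key):

...
plugin_buffer_size: 10240
plugin_pipe_size: 10240000
...

Alternatively, a ZeroMQ queue can replace the home-grown circular queue:

...
plugin_buffer_size: 10240
plugin_pipe_zmq: true
...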
Q8: I want to account both inbound and outbound traffic of my network, with a host
breakdown; how to do that in a savvy fashion? Do I need to run two daemon instances,
one per traffic direction?
A: No, this is a toy case where you will be able to leverage the pluggable architecture
of the pmacct daemons: you will run a single daemon with two plugins attached to it;
each of these will get part of the traffic (aggregate_filter), either outbound or
inbound. A sample config snippet follows:

...
aggregate[inbound]: dst_host
aggregate[outbound]: src_host
aggregate_filter[inbound]: dst net 192.168.0.0/16
aggregate_filter[outbound]: src net 192.168.0.0/16
plugins: mysql[inbound], mysql[outbound]
sql_table[inbound]: acct_in
sql_table[outbound]: acct_out
...

It will account all traffic directed to your network into the 'acct_in' table and
all traffic it generates into the 'acct_out' table. Furthermore, if you actually
need totals (inbound plus outbound traffic), you will just need to play around with
basic SQL queries.
If you are only interested in totals instead, you may alternatively use the
following piece of configuration:

...
aggregate: sum_host
plugins: mysql
networks_file: /usr/local/pmacct/etc/networks.lst
...

Where 'networks.lst' is a file in which to define the local network prefixes.
  111. Q9: I'm intimately fashioned by the idea of storing every single flow flying through my
  112. network, before making up my mind what to do with such data: i basically would like
  113. to (de-)aggregate my traffic as 'src_host, dst_host, src_port, dst_port, proto' or
  114. 'src_host, dst_host, src_port, dst_port, proto, timestamp_start, timestamp_end'. Is
  115. this feasible without any filtering?
  116. A: If such data granularity is required by the use-case addressed, ie. DDoS, forensics,
  117. security, research, etc. then this can be achieved no problem with pmacct - you have
  118. only to be careful planning for the right amount of system/cluster resources. In all
  119. other cases this is not adviceable as this would result in a huge matrix of data -
  120. meaning increased CPU, memory and disk usage - for no benefit - plus, to be always
  121. considered, the impact of unexpected network events (ie. port scans, DDoS, etc.) on
  122. the solution.
  123. Q10: I use pmacctd. What portion of the packets is included into the bytes counter ?
  124. A: The portion of the packet accounted starts from the IPv4/IPv6 header (inclusive) and
  125. ends with the last bit of the packet payload. This means that are excluded from the
  126. accounting: packet preamble (if any), link layer headers (e.g. ethernet, llc, etc.),
  127. MPLS stack length, VLAN tags size and trailing FCS (if any). This is the main reason
  128. of skews reported while comparing pmacct counters to SNMP ones. However, by having
  129. available a counter of packets, accounting for the missing portion is, in most cases,
  130. a simple math exercise which depends on the underlying network architecture.
  131. Example: Ethernet header = 14 bytes, Preamble+SFD (Start Frame Delimiter) = 8 bytes,
  132. FCS (Framke Check Sequence) = 4 bytes. It results in an addition of a maximum of 26
  133. bytes (14+8+4) for each packet. The use of VLANs will result in adding 4 more bytes
  134. to the forementioned 26.
  135. If using an SQL plugin, starting from release 0.9.2, such adjustment can be achieved
  136. directly within pmacct via the 'adjb' action (sql_preprocess).
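As a sketch of that math exercise, the adjustment can also be applied outside
pmacct, in a few lines of shell; the counter values below are made up for
illustration:

```shell
# Hypothetical counters retrieved from pmacct: 1000 packets, 1500000 bytes.
packets=1000
bytes=1500000
# Ethernet framing: 14 (header) + 8 (preamble+SFD) + 4 (FCS) = 26 bytes per
# packet; add 4 more per packet if the traffic is VLAN-tagged.
adjusted=$((bytes + packets * 26))
echo "$adjusted"    # prints 1526000
```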
Q11: What is the historical accounting feature and how to get it configured?
A: pmacct optionally allows to define arbitrary time-bins (ie. 5 mins, 1 hour, etc.)
and assign collected data to them based on a timestamp. This, in brief, is called
historical accounting and is enabled via the *history* directives (ie. print_history,
print_history_roundoff, sql_history, etc.). The time-bin to which data is allocated
is stored in the 'stamp_inserted' field (if supported by the plugin in use, ie. all
except 'print', where to avoid redundancy this is encoded as part of the file name,
and 'memory'). Flow data is by default assigned to a time-bin based on its start
time or - if that does not apply or such info is missing - the timestamp of the
whole datagram or - if that does not apply or such info is missing - the time of
arrival at the collector. Where multiple choices are supported, ie. NetFlow/IPFIX,
the nfacctd_time_new directive allows to explicitly select the time source.
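A minimal sketch of historical accounting for an SQL plugin follows; the bin
length and refresh interval are illustrative choices, not recommendations (see
CONFIG-KEYS for the accepted values of each directive):

...
! 5-minute time-bins, rounded off to the minute
sql_history: 5m
sql_history_roundoff: m
! write cached data out every 300 secs, matching the bin length
sql_refresh_time: 300
...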
Q12: Counters via CLI are good for (email, web) reporting but not enough. What are the
options to graph network data?
A: One option could be to use traditional graphing tools like RRDtool, MRTG and GNUplot
in conjunction with the 'memory' plugin. The plugin works as a cache and offers a
pull mechanism, the pmacct IMT client tool, that allows to easily retrieve counters:

shell> ./pmacctd -D -c src_host -P memory -i eth0
shell> ./pmacct -c src_host -N 192.168.4.133 -r
2339
shell>

Et voila'! This is the bytes counter. Because of the '-r', counters will get reset,
or, translating into the RRDtool jargon, each time you will get an 'ABSOLUTE' value.
Let's now encapsulate our query into, say, the RRDtool commandline:

shell> rrdtool update 192_168_4_133.rrd N:`./pmacct -c src_host -N 192.168.4.133 -r`

Multiple requests can be batched as part of a single query, each request being ';'
separated via CLI or read from an input file (one query per line):

shell> ./pmacct -c src_host,dst_host -N 192.168.4.133,192.168.0.101;192.168.4.5,192.168.4.1;... -r
50905
1152
...

OR

shell> ./pmacct -c src_host,dst_host -N "file:queries.list" -r
...
shell> cat queries.list
192.168.4.133,192.168.0.101
192.168.4.5,192.168.4.1
...

A second option is to leverage one of the several modern data analytics stacks that
typically comprise data manipulation, storage and visualization. Pointers in this
sense would be the ELK stack (ElasticSearch, Logstash, Kibana) or the TICK stack
(Telegraf, InfluxDB, Chronograf, Kapacitor). Many more exist.
Q13: The network equipment I'm using supports sFlow but I don't know how to enable it.
I'm unable to find any sFlow-related command. What to do?
A: If you are unable to enable sFlow via the commandline, you have to resort to the
SNMP way. The sFlow MIB is documented in RFC 3176; all you will need is to enable
an SNMP community with both read and write access. Then, continue using the
sflowenable tool available at the following URL:
http://www.inmon.com/technology/sflowenable
Q14: When I launch either nfacctd or sfacctd I receive the following error
message: "ERROR ( default/core ): socket() failed". What to do?
A: When IPv6 code is enabled, sfacctd and nfacctd will try to fire up an IPv6 socket.
The error message is very likely to be caused by the proper kernel module not being
loaded. So, try either to load it or to specify an IPv4 address to bind to. If using
a configuration file, add a line like 'nfacctd_ip: 192.168.0.14'; otherwise, if
going commandline, use the following: 'nfacctd [ ... options ... ] -L 192.168.0.14'.
  192. Q15: SQL table versions, what they are -- why and when do i need them ? Also, can i
  193. customize SQL tables ?
  194. A: pmacct tarball gets with so called 'default' tables (IP and BGP); they are built
  195. by SQL scripts stored in the 'sql/' section of the tarball. Default tables enable
  196. to start quickly with pmacct out-of-the-box; this doesn't imply they are suitable
  197. as-is to larger installations. SQL table versioning is used to introduce features
  198. over the time without breaking backward compatibility when upgrading pmacct. The
  199. most updated guide on which version to use given a required feature-set can be,
  200. once again, found in the 'sql/' section of the tarball.
  201. SQL tables *can* be fully customized so that primitives of interest can be freely
  202. mixed and matched - hence making a SQL table to perfectly adhere to the required
  203. feature-set. This is achieved by setting the 'sql_optimize_clauses' configuration
  204. key. You will then be responsible for building the custom schema and indexes.
Q16: What is the best way to kill a running instance of pmacct avoiding data loss?
A: Two ways. a) Simply kill a specific plugin that you don't need anymore: you will
have to identify it and use the 'kill -INT <process number>' command; b) kill the
whole pmacct instance: you can either use the 'killall -INT <daemon name>' command
or identify the Core Process and use the 'kill -INT <process number>' command. All
of these will do the job for you: stop receiving new data from the network, clear
the memory buffers, and notify the running plugins to take the exit lane (which
in turn will clear cached data as required).
To identify the Core Process you can either take a look at the process list (on
the Operating Systems where the setproctitle() call is supported by pmacct) or
use the 'pidfile' (-F) directive. Note also that shutting down the daemon nicely
improves restart turn-around times: the existing daemon will, first thing, close
its listening socket, while the newly launched one will mostly take advantage of
the SO_REUSEADDR socket option.
  219. Q17: I find interesting store network data in a SQL database. But i'm actually hitting
  220. poor performances. Do you have any tips to improve/optimize things ?
  221. A: Few hints are summed below in order to improve SQL database performances. They are
  222. not really tailored to a specific SQL engine but rather of general applicability.
  223. Many thanks to Wim Kerkhoff for the many suggestions he contributed on this topic
  224. over the time:
  225. * Keep the SQL schema lean: include only required fields, strip off all the others.
  226. Set the 'sql_optimize_clauses' configuration key in order to flag pmacct you are
  227. going to use a custom-built table.
  228. * Avoid SQL UPDATEs as much as possible and use only INSERTs. This can be achieved
  229. by setting the 'sql_dont_try_update' configuration key. A pre-condition is to let
  230. sql_history == sql_refresh_time. UPDATEs are demanding in terms of resources and
  231. are, for simplicity, enabled by default.
  232. * If the previous point holds, then look for and enable database-specific directives
  233. aimed to optimize performances ie. sql_multi_values for MySQL and sql_use_copy for
  234. PostgreSQL.
  235. * Don't rely automagically on standard indexes but enable optimal indexes based on
  236. clauses you (by means of reports, 3rd party tools, scripts, etc.) and pmacct use
  237. the most to SELECT data. Then remove every unused index.
  238. * See if the dynamic table strategy offered by pmacct fits the bill: helps keeping
  239. SQL tables and indexes of a manageable size by rotating SQL tables (ie. per hour,
  240. day, week, etc.). See the 'sql_table_schema' configuration directive.
  241. * Run all SELECT and UPDATE queries under the "EXPLAIN ANALYZE ..." method to see
  242. if they are actually hitting the indexes. If not, you need to build indexes that
  243. better fit the actual scenario.
  244. * Sometimes setting "SET enable_seqscan=no;" before a SELECT query can make a big
  245. difference. Also don't underestimate the importance of daily VACUUM queries: 3-5
  246. VACUUMs + 1 VACUUM FULL is generally a good idea. These tips hold for PostgreSQL.
  247. * MyISAM is a lean SQL engine; if there is no concurrence, it might be preferred to
  248. InnoDB. Lack of transactions can reveal painful in case of unsecured shutdowns,
  249. requiring data recovery. This applies to MySQL only.
  250. * Disabling fsync() does improve performance. This might have painful consequences
  251. in case of unsecured shutdowns (remember power failure is a variable ...).
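A minimal sketch tying together the INSERT-only setup from the second bullet with
the dynamic table strategy from the fifth; bin length and table naming are
illustrative assumptions, and a matching 'sql_table_schema' script is needed for
pmacct to create the rotated tables:

...
! INSERT-only operation: time-bin length must equal the refresh time
sql_history: 1h
sql_refresh_time: 3600
sql_dont_try_update: true
! rotate to a new table every day to keep tables and indexes manageable
sql_table: acct_%Y%m%d
sql_table_schema: /usr/local/pmacct/etc/acct_schema.sql
...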
  252. Q18: Does having the local timezone configured on servers, routers, etc. - which can
  253. very well include DST (Daylight Saving Time) shifts, impact accounting?
  254. A: It is good rule to run the infrastructure and the backend part of the accounting
  255. system as UTC; for example, accuracy can be negatively impacted if sampled flows
  256. are cached on a router while the DST shift takes place; plus, pmacct uses system
  257. clock to calculate time-bins and scanner deadlines among the others. In short,
  258. the use of local timezones is not recommended.
  259. Q19: I'm using the 'tee' plugin with transparent mode set to true and keep receiving
  260. "Can't bridge Address Families when in transparent mode. Exiting ..." messages,
  261. why?
  262. A: It means you can't receive packets on an IPv4 address and transparently replicate
  263. to an IPv6 collector or vice-versa. Less obvious scenarios are: a) some operating
  264. systems where loopback (127.0.0.1) is considered a different address family hence
  265. it's not possible to replicate to a 127.0.0.1 address; it's possible though to use
  266. any locally configured IPv4 address bound to a (sub-)interface in 'up' state; b)
  267. an IPv4-mapped IPv6 address is still technically an IPv6 address hence on servers
  268. running IPv4 and IPv6 it is good practice to explicitely define also the receiving
  269. IP address (nfacctd_ip), if IPv4 is used.
  270. Q20: I'm using IPv6 support in pmacct. Even though the daemon binds to the "::"
  271. address, i don't receive NetFlow/IPFIX/sFlow/BGP data sent via IPv4, why?
  272. A: Binding to a "::" address (ie. no [sn]facctd_ip specified should allow to receive
  273. both IPv4 and IPv6 senders. IPv4 ones should be reportd in 'netstat' as IPv4-mapped
  274. IPv6 addresses. Linux has a kernel switch to enable/disable the functionality and
  275. its status can be checked via the /proc/sys/net/ipv6/bindv6only . Historically the
  276. default has been '0'. It appears over time some distributions have changed the
  277. default to be '1'. If you experience this issue on Linux, please check your kernel
  278. setting.
  279. Q21: How can i count how much telemetry data (ie. NetFlow, sFlow, IPFIX, Streaming
  280. Telemetry) i'm receiving on my collector?
  281. A: If the interface where telemetry data is received is dedicated to the task then any
  282. ifconfig, netstat or dstat tools or SNMP meaurement would do in order to verify
  283. amount of telemetry packets and bytes (from which packets per second, bytes per
  284. second can be easily inferred). If, instead, the interface is shared then pmacctd,
  285. the libpcap-based daemon, can help to isolate and account for the telemetry traffic;
  286. guess telemetry data is pointed to UDP port 2100 of the IP address configured on
  287. eth0, pmacctd can be started as "pmacctd -i eth0 -P print -c none port 2100" to
  288. account for the grand total of telemetry packets and bytes; if a breakdown per
  289. telemetry exporting node is wanted, the following command-line can be used: "pmacctd
  290. -i eth0 -P print -c src_host port 2100"; this example is suitable for manual reading
  291. as it will print data every 60 secs on the screen and can, of course, be complicated
  292. slightly to make it suitable for automation. A related question that often arises
  293. is: how many flows per second am i receiving? This can be similarly addressed by
  294. using "nfacctd -P print -c flows" for NetFlow/IPFIX and "sfacctd -P print -c flows"
  295. for sFlow. Here FLOWS is the amount of flow records (NetFlow/IPFIX) or flow samples
  296. (sFlow) processed in the period of time, and is the measure of interest. Changing
  297. the aggregation argument in "-c peer_src_ip,flows" gives the amount of flows per
  298. telemetry exporter (ie. router).
  299. /* EOF */