summaryrefslogtreecommitdiff
path: root/Documentation/networking
AgeCommit message (Collapse)Author
2015-04-14Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
The dwmac-socfpga.c conflict was a case of a bug fix overlapping changes in net-next to handle an error pointer differently. Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-09ixgb: remove references to ifconfigStephen Hemminger
Move documentation into this century, even if this device hasn't been available for some time. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-04-09ixgbe: fix documentationStephen Hemminger
The MTU values in the documentation do not match the source. The source has frame limit of IXGBE_MAX_JUMBO_FRAME_SIZE (9728) which is MTU of 9710 because of the accounting for Ethernet header and CRC. Also, don't refer to the obsolete ifconfig command. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-04-09igb: doc don't refer to ifconfigStephen Hemminger
ifconfig command is obsolete, best to remove all references so that new users learn ip. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-04-08RDS: Documentation: Document AF_RDS, PF_RDS and SOL_RDS correctly.Sowmini Varadhan
AF_RDS, PF_RDS and SOL_RDS are available in header files, and there is no need to get their values from /proc. Document this correctly. Fixes: 0c5f9b8830aa ("RDS: Documentation") Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-04-01can: introduce new raw socket option to join the given CAN filtersOliver Hartkopp
The CAN_RAW socket can set multiple CAN identifier specific filters that lead to multiple filters in the af_can.c filter processing. These filters are indenpendent from each other which leads to logical OR'ed filters when applied. This socket option joines the given CAN filters in the way that only CAN frames are passed to user space that matched *all* given CAN filters. The semantic for the applied filters is therefore changed to a logical AND. This is useful especially when the filterset is a combination of filters where the CAN_INV_FILTER flag is set in order to notch single CAN IDs or CAN ID ranges from the incoming traffic. As the raw_rcv() function is executed from NET_RX softirq the introduced variables are implemented as per-CPU variables to avoid extensive locking at CAN frame reception time. Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2015-03-24filter: introduce SKF_AD_VLAN_TPID BPF extensionMichal Sekletar
If vlan offloading takes place then vlan header is removed from frame and its contents, both vlan_tci and vlan_proto, is available to user space via TPACKET interface. However, only vlan_tci can be used in BPF filters. This commit introduces a new BPF extension. It makes possible to load the value of vlan_proto (vlan TPID) to register A. Support for classic BPF and eBPF is being added, analogous to skb->protocol. Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Alexei Starovoitov <ast@plumgrid.com> Cc: Jiri Pirko <jpirko@redhat.com> Signed-off-by: Michal Sekletar <msekleta@redhat.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@plumgrid.com> Reviewed-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-23ipv6: add documentation for stable_secret, idgen_delay and idgen_retries knobsHannes Frederic Sowa
Cc: Erik Kline <ek@google.com> Cc: Fernando Gont <fgont@si6networks.com> Cc: Lorenzo Colitti <lorenzo@google.com> Cc: YOSHIFUJI Hideaki/吉藤英明 <hideaki.yoshifuji@miraclelinux.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-23af_packet: pass checksum validation status to the userAlexander Drozdov
Introduce TP_STATUS_CSUM_VALID tp_status flag to tell the af_packet user that at least the transport header checksum has been already validated. For now, the flag may be set for incoming packets only. Signed-off-by: Alexander Drozdov <al.drozdov@gmail.com> Cc: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-20net: neighbour: Document {mcast, ucast}_solicit, mcast_resolicit.YOSHIFUJI Hideaki/吉藤英明
Signed-off-by: YOSHIFUJI Hideaki <hideaki.yoshifuji@miraclelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-18net: Add max rate tx queue attributeJohn Fastabend
This adds a tx_maxrate attribute to the tx queue sysfs entry allowing for max-rate limiting. Along with DCB-ETS and BQL this provides another knob to tune queue performance. The limit units are Mbps. By default it is disabled. To disable the rate limitation after it has been set for a queue, it should be set to zero. Signed-off-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-08neterion: remove reference to ifconfigstephen hemminger
Remove reference to obsolete ifconfig command. MTU can be changed with ip command instead. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-06ipv4: Documenting two sysctls for tcp PMTU probeFan Du
Namely tcp_probe_interval to control how often to restart a probe. And tcp_probe_threshold to control when stop the probing in respect to the width of search range in bytes Signed-off-by: Fan Du <fan.du@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-04mpls: Add a sysctl to control the size of the mpls label tableEric W. Biederman
This sysctl gives two benefits. By defaulting the table size to 0 mpls even when compiled in and enabled defaults to not forwarding any packets. This prevents unpleasant surprises for users. The other benefit is that as mpls labels are allocated locally a dense table a small dense label table may be used which saves memory and is extremely simple and efficient to implement. This sysctl allows userspace to choose the restrictions on the label table size userspace applications need to cope with. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-03-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller
Pablo Neira Ayuso says: ==================== Netfilter updates for net-next A small batch with accumulated updates in nf-next, mostly IPVS updates, they are: 1) Add 64-bits stats counters to IPVS, from Julian Anastasov. 2) Move NETFILTER_XT_MATCH_ADDRTYPE out of NETFILTER_ADVANCED as docker seem to require this, from Anton Blanchard. 3) Use boolean instead of numeric value in set_match_v*(), from coccinelle via Fengguang Wu. 4) Allows rescheduling of new connections in IPVS when port reuse is detected, from Marcelo Ricardo Leitner. 5) Add missing bits to support arptables extensions from nft_compat, from Arturo Borrero. Patrick is preparing a large batch to enhance the set infrastructure, named expressions among other things, that should follow up soon after this batch. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-25ipvs: allow rescheduling of new connections when port reuse is detectedMarcelo Ricardo Leitner
Currently, when TCP/SCTP port reusing happens, IPVS will find the old entry and use it for the new one, behaving like a forced persistence. But if you consider a cluster with a heavy load of small connections, such reuse will happen often and may lead to a not optimal load balancing and might prevent a new node from getting a fair load. This patch introduces a new sysctl, conn_reuse_mode, that allows controlling how to proceed when port reuse is detected. The default value will allow rescheduling of new connections only if the old entry was in TIME_WAIT state for TCP or CLOSED for SCTP. Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com> Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Simon Horman <horms@verge.net.au>
2015-02-23pktgen: Correct documentation of module name and commandBen Hutchings
Drop the '.o' suffix so this text properly covers both the built-in and modular cases. 'insmod pktgen' obviously won't work; the command should be modprobe. Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-23samples/pktgen: Add sample scripts for pktgen facilityBen Hutchings
These are Robert Olsson's samples which used to be available from <ftp://robur.slu.se/pub/Linux/net-development/pktgen-testing/examples/> but currently are not. Change the documentation to refer to these consistently as 'sample scripts', matching the directory name used here. Cc: Robert Olsson <robert@herjulf.se> Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-23pktgen: Fix grammar errors and some poor wording in documentationBen Hutchings
Thanks to Rob Jones for suggesting some of the changes. Cc: Rob Jones <rob.jones@codethink.co.uk> Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-23pktgen: Delete the original date from documentationBen Hutchings
This has been updated quite a few times since 2004, and git can keep track of the actual date for us. Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-11Merge tag 'docs-for-linus' of git://git.lwn.net/linux-2.6Linus Torvalds
Pull documentation updates from Jonathan Corbet: "Highlights this time around include: - A thrashing of SubmittingPatches to bring it out of the "send everything to Linus" era of kernel development. - A new document on completions from Nicholas McGuire - Lots of typo fixes, formatting improvements, corrections, build fixes, and more" * tag 'docs-for-linus' of git://git.lwn.net/linux-2.6: (35 commits) Documentation: Fix the wrong command `echo -1 > set_ftrace_pid` for cleaning the filter. can-doc: Fixed a wrong filepath in can.txt Documentation: Fix trivial typo in comment. kgdb,docs: Fix typo and minor style issues Documentation: add description for FTRACE probe status doc: brief user documentation for completion Documentation/misc-devices/mei: Fix indentation of embedded code. Documentation/misc-devices/mei: Fix indentation of enumeration. Documentation/misc-devices/mei: Fix spacing around parentheses. Documentation/misc-devices/mei: Fix formatting of headings. Documentation: devicetree: Fix double words in Doumentation/devicetree Documentation: mm: Fix typo in vm.txt lockstat: Add documentation on contention and contenting points Documentation: fix blackfin gptimers-example build errors Fixes column alignment in table of contents entry 1.9 in Documentation/filesystems/proc.txt CodingStyle: enable emacs display of trailing whitespace DocBook: Do not exceed argument list limit gpio: board.txt: Fix the gpio name example Documentation/SubmittingPatches: unify whitespace/tabs for the DCO MAINTAINERS: Add the docs-next git tree to the maintainer entry ...
2015-02-08tcp: helpers to mitigate ACK loops by rate-limiting out-of-window dupacksNeal Cardwell
Helpers for mitigating ACK loops by rate-limiting dupacks sent in response to incoming out-of-window packets. This patch includes: - rate-limiting logic - sysctl to control how often we allow dupacks to out-of-window packets - SNMP counter for cases where we rate-limited our dupack sending The rate-limiting logic in this patch decides to not send dupacks in response to out-of-window segments if (a) they are SYNs or pure ACKs and (b) the remote endpoint is sending them faster than the configured rate limit. We rate-limit our responses rather than blocking them entirely or resetting the connection, because legitimate connections can rely on dupacks in response to some out-of-window segments. For example, zero window probes are typically sent with a sequence number that is below the current window, and ZWPs thus expect to thus elicit a dupack in response. We allow dupacks in response to TCP segments with data, because these may be spurious retransmissions for which the remote endpoint wants to receive DSACKs. This is safe because segments with data can't realistically be part of ACK loops, which by their nature consist of each side sending pure/data-less ACKs to each other. The dupack interval is controlled by a new sysctl knob, tcp_invalid_ratelimit, given in milliseconds, in case an administrator needs to dial this upward in the face of a high-rate DoS attack. The name and units are chosen to be analogous to the existing analogous knob for ICMP, icmp_ratelimit. The default value for tcp_invalid_ratelimit is 500ms, which allows at most one such dupack per 500ms. This is chosen to be 2x faster than the 1-second minimum RTO interval allowed by RFC 6298 (section 2, rule 2.4). We allow the extra 2x factor because network delay variations can cause packets sent at 1 second intervals to be compressed and arrive much closer. Reported-by: Avery Fay <avery@mixpanel.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: drivers/net/vxlan.c drivers/vhost/net.c include/linux/if_vlan.h net/core/dev.c The net/core/dev.c conflict was the overlap of one commit marking an existing function static whilst another was adding a new function. In the include/linux/if_vlan.h case, the type used for a local variable was changed in 'net', whereas the function got rewritten to fix a stacked vlan bug in 'net-next'. In drivers/vhost/net.c, Al Viro's iov_iter conversions in 'net-next' overlapped with an endainness fix for VHOST 1.0 in 'net'. In drivers/net/vxlan.c, vxlan_find_vni() added a 'flags' parameter in 'net-next' whereas in 'net' there was a bug fix to pass in the correct network namespace pointer in calls to this function. Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-05can-doc: Fixed a wrong filepath in can.txtStefan Tatschner
<linux/can/error.h> moved in the big UAPI shuffle; update the document to note its new location. Signed-off-by: Stefan Tatschner <stefan@sevenbyte.org> [jc: added changelog] Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2015-02-02Documentation: Update netlink_mmap.txtRichard Weinberger
Update netlink_mmap.txt wrt. commit 4682a0358639b29cf ("netlink: Always copy on mmap TX."). Signed-off-by: Richard Weinberger <richard@nod.at> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-02net-timestamp: no-payload option in txtimestamp testWillem de Bruijn
Demonstrate how SOF_TIMESTAMPING_OPT_TSONLY can be used and test the implementation. Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-02-02net-timestamp: no-payload optionWillem de Bruijn
Add timestamping option SOF_TIMESTAMPING_OPT_TSONLY. For transmit timestamps, this loops timestamps on top of empty packets. Doing so reduces the pressure on SO_RCVBUF. Payload inspection and cmsg reception (aside from timestamps) are no longer possible. This works together with a follow on patch that allows administrators to only allow tx timestamping if it does not loop payload or metadata. Signed-off-by: Willem de Bruijn <willemb@google.com> ---- Changes (rfc -> v1) - add documentation - remove unnecessary skb->len test (thanks to Richard Cochran) Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-26openvswitch: Add support for unique flow IDs.Joe Stringer
Previously, flows were manipulated by userspace specifying a full, unmasked flow key. This adds significant burden onto flow serialization/deserialization, particularly when dumping flows. This patch adds an alternative way to refer to flows using a variable-length "unique flow identifier" (UFID). At flow setup time, userspace may specify a UFID for a flow, which is stored with the flow and inserted into a separate table for lookup, in addition to the standard flow table. Flows created using a UFID must be fetched or deleted using the UFID. All flow dump operations may now be made more terse with OVS_UFID_F_* flags. For example, the OVS_UFID_F_OMIT_KEY flag allows responses to omit the flow key from a datapath operation if the flow has a corresponding UFID. This significantly reduces the time spent assembling and transacting netlink messages. With all OVS_UFID_F_OMIT_* flags enabled, the datapath only returns the UFID and statistics for each flow during flow dump, increasing ovs-vswitchd revalidator performance by 40% or more. Signed-off-by: Joe Stringer <joestringer@nicira.com> Acked-by: Pravin B Shelar <pshelar@nicira.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-25net: ipv6: Add sysctl entry to disable MTU updates from RAHarout Hedeshian
The kernel forcefully applies MTU values received in router advertisements provided the new MTU is less than the current. This behavior is undesirable when the user space is managing the MTU. Instead a sysctl flag 'accept_ra_mtu' is introduced such that the user space can control whether or not RA provided MTU updates should be applied. The default behavior is unchanged; user space must explicitly set this flag to 0 for RA MTUs to be ignored. Signed-off-by: Harout Hedeshian <harouth@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller
Pablo Neira Ayuso says: ==================== netfilter updates for net-next The following patchset contains netfilter updates for net-next, just a bunch of cleanups and small enhancement to selectively flush conntracks in ctnetlink, more specifically the patches are: 1) Rise default number of buckets in conntrack from 16384 to 65536 in systems with >= 4GBytes, patch from Marcelo Leitner. 2) Small refactor to save one level on indentation in xt_osf, from Joe Perches. 3) Remove unnecessary sizeof(char) in nf_log, from Fabian Frederick. 4) Another small cleanup to remove redundant variable in nfnetlink, from Duan Jiong. 5) Fix compilation warning in nfnetlink_cthelper on parisc, from Chen Gang. 6) Fix wrong format in debugging for ctseqadj, from Gao feng. 7) Selective conntrack flushing through the mark for ctnetlink, patch from Kristian Evensen. 8) Remove nf_ct_conntrack_flush_report() exported symbol now that is not required anymore after the selective flushing patch, again from Kristian. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-15Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: drivers/net/xen-netfront.c Minor overlapping changes in xen-netfront.c, mostly to do with some buffer management changes alongside the split of stats into TX and RX. Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-13net: rename vlan_tx_* helpers since "tx" is misleading thereJiri Pirko
The same macros are used for rx as well. So rename it. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-12update ip-sysctl.txt documentation (v2)Ani Sinha
Update documentation to reflect the fact that /proc/sys/net/ipv4/route/max_size is no longer used for ipv4. Signed-off-by: Ani Sinha <ani@arista.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-11doc: fix the compile fix of txtimestamp.cWillem de Bruijn
A fix to ipv6 structure definitions removed the now superfluous definition of in6_pktinfo in this file. But, use of the glibc definition requires defining _GNU_SOURCE (see also https://sourceware.org/bugzilla/show_bug.cgi?id=6775). Before this change, the following would fail for me: make make headers_install make M=Documentation/networking/timestamping with Documentation/networking/timestamping/txtimestamp.c: In function '__recv_errmsg_cmsg': Documentation/networking/timestamping/txtimestamp.c:205:33: error: dereferencing pointer to incomplete type Documentation/networking/timestamping/txtimestamp.c:206:23: error: dereferencing pointer to incomplete type After this patch compilation succeeded. Fixes: cd91cc5bdddf ("doc: fix the compile error of txtimestamp.c") Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-01-08doc: fix the compile error of txtimestamp.cWANG Cong
Vinson reported: HOSTCC Documentation/networking/timestamping/txtimestamp Documentation/networking/timestamping/txtimestamp.c:64:8: error: redefinition of ‘struct in6_pktinfo’ struct in6_pktinfo { ^ In file included from /usr/include/arpa/inet.h:23:0, from Documentation/networking/timestamping/txtimestamp.c:33: /usr/include/netinet/in.h:456:8: note: originally defined here struct in6_pktinfo ^ After we sync with libc header, we don't need this ugly hack any more. Reported-by: Vinson Lee <vlee@twopensource.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-29Update of Documentation/networking/00-INDEXHenrik Austad
- altera_tse.txt was added by 04add4ab (Add Altera Ethernet (TSE) Documentation) - cdc_mbim.txt was added by a563babe (cdc_mbim: add driver documentation) - dctcp.txt was added by e3118e83 (tcp: add DCTCP congestion control algorithm) CC: Jonathan Corbet <corbet@lwn.net> CC: "David S. Miller" <davem@davemloft.net> CC: linux-doc@vger.kernel.org CC: linux-kernel@vger.kernel.org Signed-off-by: Henrik Austad <henrik@austad.us> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2014-12-23netfilter: conntrack: adjust nf_conntrack_buckets default valueMarcelo Leitner
Manually bumping either nf_conntrack_buckets or nf_conntrack_max has become a common task as our Linux servers tend to serve more and more clients/applications, so let's adjust nf_conntrack_buckets this to a more updated value. Now for systems with more than 4GB of memory, nf_conntrack_buckets becomes 65536 instead of 16384, resulting in nf_conntrack_max=256k entries. Signed-off-by: Marcelo Ricardo Leitner <mleitner@redhat.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-12-15fib_trie.txt: fix typoDuan Jiong
Fix the typo, there should be "It". On the other hand, fix whitespace errors detected by checkpatch.pl Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-09Documentation (ixgbe.txt): use a decimal address.Rami Rosen
This patch fixes the erronous usage of an hexadecimal address in the example, by replacing it with a decimal address. Signed-off-by: Rami Rosen <ramirose@gmail.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-08net-timestamp: expand documentation and testWillem de Bruijn
Documentation: expand explanation of timestamp counter Test: new: flag -I requests and prints PKTINFO new: flag -x prints payload (possibly truncated) fix: remove pretty print that breaks common flag '-l 1' Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-08net-timestamp: allow reading recv cmsg on errqueue with origin tstampWillem de Bruijn
Allow reading of timestamps and cmsg at the same time on all relevant socket families. One use is to correlate timestamps with egress device, by asking for cmsg IP_PKTINFO. on AF_INET sockets, call the relevant function (ip_cmsg_recv). To avoid changing legacy expectations, only do so if the caller sets a new timestamping flag SOF_TIMESTAMPING_OPT_CMSG. on AF_INET6 sockets, IPV6_PKTINFO and all other recv cmsg are already returned for all origins. only change is to set ifindex, which is not initialized for all error origins. In both cases, only generate the pktinfo message if an ifindex is known. This is not the case for ACK timestamps. The difference between the protocol families is probably a historical accident as a result of the different conditions for generating cmsg in the relevant ip(v6)_recv_error function: ipv4: if (serr->ee.ee_origin == SO_EE_ORIGIN_ICMP) { ipv6: if (serr->ee.ee_origin != SO_EE_ORIGIN_LOCAL) { At one time, this was the same test bar for the ICMP/ICMP6 distinction. This is no longer true. Signed-off-by: Willem de Bruijn <willemb@google.com> ---- Changes v1 -> v2 large rewrite - integrate with existing pktinfo cmsg generation code - on ipv4: only send with new flag, to maintain legacy behavior - on ipv6: send at most a single pktinfo cmsg - on ipv6: initialize fields if not yet initialized The recv cmsg interfaces are also relevant to the discussion of whether looping packet headers is problematic. For v6, cmsgs that identify many headers are already returned. This patch expands that to v4. If it sounds reasonable, I will follow with patches 1. request timestamps without payload with SOF_TIMESTAMPING_OPT_TSONLY (http://patchwork.ozlabs.org/patch/366967/) 2. sysctl to conditionally drop all timestamps that have payload or cmsg from users without CAP_NET_RAW. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-12-02net: introduce generic switch devices supportJiri Pirko
The goal of this is to provide a possibility to support various switch chips. Drivers should implement relevant ndos to do so. Now there is only one ndo defined: - for getting physical switch id is in place. Note that user can use random port netdevice to access the switch. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Reviewed-by: Thomas Graf <tgraf@suug.ch> Acked-by: Andy Gospodarek <gospo@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
2014-11-25net-timestamp: Fix a documentation typoAndrew Lutomirski
SOF_TIMESTAMPING_OPT_ID puts the id in ee_data, not ee_info. Cc: Willem de Bruijn <willemb@google.com> Signed-off-by: Andy Lutomirski <luto@amacapital.net> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-24ipvlan: Initial check-in of the IPVLAN driver.Mahesh Bandewar
This driver is very similar to the macvlan driver except that it uses L3 on the frame to determine the logical interface while functioning as packet dispatcher. It inherits L2 of the master device hence the packets on wire will have the same L2 for all the packets originating from all virtual devices off of the same master device. This driver was developed keeping the namespace use-case in mind. Hence most of the examples given here take that as the base setup where main-device belongs to the default-ns and virtual devices are assigned to the additional namespaces. The device operates in two different modes and the difference in these two modes in primarily in the TX side. (a) L2 mode : In this mode, the device behaves as a L2 device. TX processing upto L2 happens on the stack of the virtual device associated with (namespace). Packets are switched after that into the main device (default-ns) and queued for xmit. RX processing is simple and all multicast, broadcast (if applicable), and unicast belonging to the address(es) are delivered to the virtual devices. (b) L3 mode : In this mode, the device behaves like a L3 device. TX processing upto L3 happens on the stack of the virtual device associated with (namespace). Packets are switched to the main-device (default-ns) for the L2 processing. Hence the routing table of the default-ns will be used in this mode. RX processins is somewhat similar to the L2 mode except that in this mode only Unicast packets are delivered to the virtual device while main-dev will handle all other packets. The devices can be added using the "ip" command from the iproute2 package - ip link add link <master> <virtual> type ipvlan mode [ l2 | l3 ] Signed-off-by: Mahesh Bandewar <maheshb@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Maciej Żenczykowski <maze@google.com> Cc: Laurent Chavey <chavey@google.com> Cc: Tim Hockin <thockin@google.com> Cc: Brandon Philips <brandon.philips@coreos.com> Cc: Pavel Emelianov <xemul@parallels.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-19stmmac: update driver documentationGiuseppe CAVALLARO
Recently many changes have been done inside the driver so this patch updates the driver's doc for example reviewing information for the rx and tx processes that are managed by napi method, adding new information for missing glue-logic files etc. Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-11-06Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
2014-11-05net: Add missing descriptions for fwmark_reflect for ipv4 and ipv6.Loganaden Velvindron
It was initially sent by Lorenzo Colitti, but was subsequently lost in the final diff he submitted. Signed-off-by: Loganaden Velvindron <logan@elandsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-29net: ipv6: Add a sysctl to make optimistic addresses useful candidatesErik Kline
Add a sysctl that causes an interface's optimistic addresses to be considered equivalent to other non-deprecated addresses for source address selection purposes. Preferred addresses will still take precedence over optimistic addresses, subject to other ranking in the source address selection algorithm. This is useful where different interfaces are connected to different networks from different ISPs (e.g., a cell network and a home wifi network). The current behaviour complies with RFC 3484/6724, and it makes sense if the host has only one interface, or has multiple interfaces on the same network (same or cooperating administrative domain(s), but not in the multiple distinct networks case. For example, if a mobile device has an IPv6 address on an LTE network and then connects to IPv6-enabled wifi, while the wifi IPv6 address is undergoing DAD, IPv6 connections will try use the wifi default route with the LTE IPv6 address, and will get stuck until they time out. Also, because optimistic nodes can receive frames, issue an RTM_NEWADDR as soon as DAD starts (with the IFA_F_OPTIMSTIC flag appropriately set). A second RTM_NEWADDR is sent if DAD completes (the address flags have changed), otherwise an RTM_DELADDR is sent. Also: add an entry in ip-sysctl.txt for optimistic_dad. Signed-off-by: Erik Kline <ek@google.com> Acked-by: Lorenzo Colitti <lorenzo@google.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-29tcp: allow for bigger reordering levelEric Dumazet
While testing upcoming Yaogong patch (converting out of order queue into an RB tree), I hit the max reordering level of linux TCP stack. Reordering level was limited to 127 for no good reason, and some network setups [1] can easily reach this limit and get limited throughput. Allow a new max limit of 300, and add a sysctl to allow admins to even allow bigger (or lower) values if needed. [1] Aggregation of links, per packet load balancing, fabrics not doing deep packet inspections, alternative TCP congestion modules... Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Yaogong Wang <wygivan@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>