1. 15 Oct, 2007 4 commits
  2. 10 Oct, 2007 10 commits
    • Denis V. Lunev's avatar
      [NET]: make netlink user -> kernel interface synchronious · cd40b7d3
      Denis V. Lunev authored
      
      
      This patch make processing netlink user -> kernel messages synchronious.
      This change was inspired by the talk with Alexey Kuznetsov about current
      netlink messages processing. He says that he was badly wrong when introduced 
      asynchronious user -> kernel communication.
      
      The call netlink_unicast is the only path to send message to the kernel
      netlink socket. But, unfortunately, it is also used to send data to the
      user.
      
      Before this change the user message has been attached to the socket queue
      and sk->sk_data_ready was called. The process has been blocked until all
      pending messages were processed. The bad thing is that this processing
      may occur in the arbitrary process context.
      
      This patch changes nlk->data_ready callback to get 1 skb and force packet
      processing right in the netlink_unicast.
      
      Kernel -> user path in netlink_unicast remains untouched.
      
      EINTR processing for in netlink_run_queue was changed. It forces rtnl_lock
      drop, but the process remains in the cycle until the message will be fully
      processed. So, there is no need to use this kludges now.
      Signed-off-by: default avatarDenis V. Lunev <den@openvz.org>
      Acked-by: default avatarAlexey Kuznetsov <kuznet@ms2.inr.ac.ru>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      cd40b7d3
    • Pavel Emelyanov's avatar
      [NETFILTER]: Make netfilter code use the seq_open_private · e2da5913
      Pavel Emelyanov authored
      
      
      Just switch to the consolidated calls.
      
      ipt_recent() has to initialize the private, so use
      the __seq_open_private() helper.
      Signed-off-by: default avatarPavel Emelyanov <xemul@openvz.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e2da5913
    • Patrick McHardy's avatar
      f73e924c
    • Patrick McHardy's avatar
      [NETFILTER]: nfnetlink: rename functions containing 'nfattr' · fdf70832
      Patrick McHardy authored
      
      
      There is no struct nfattr anymore, rename functions to 'nlattr'.
      Signed-off-by: default avatarPatrick McHardy <kaber@trash.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      fdf70832
    • Patrick McHardy's avatar
      [NETFILTER]: nfnetlink: convert to generic netlink attribute functions · df6fb868
      Patrick McHardy authored
      
      
      Get rid of the duplicated rtnetlink macros and use the generic netlink
      attribute functions. The old duplicated stuff is moved to a new header
      file that exists just for userspace.
      Signed-off-by: default avatarPatrick McHardy <kaber@trash.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      df6fb868
    • Stephen Hemminger's avatar
      [NET]: Wrap hard_header_parse · b95cce35
      Stephen Hemminger authored
      
      
      Wrap the hard_header_parse function to simplify next step of
      header_ops conversion.
      Signed-off-by: default avatarStephen Hemminger <shemminger@linux-foundation.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b95cce35
    • Eric W. Biederman's avatar
      [NET]: Make the device list and device lookups per namespace. · 881d966b
      Eric W. Biederman authored
      
      
      This patch makes most of the generic device layer network
      namespace safe.  This patch makes dev_base_head a
      network namespace variable, and then it picks up
      a few associated variables.  The functions:
      dev_getbyhwaddr
      dev_getfirsthwbytype
      dev_get_by_flags
      dev_get_by_name
      __dev_get_by_name
      dev_get_by_index
      __dev_get_by_index
      dev_ioctl
      dev_ethtool
      dev_load
      wireless_process_ioctl
      
      were modified to take a network namespace argument, and
      deal with it.
      
      vlan_ioctl_set and brioctl_set were modified so their
      hooks will receive a network namespace argument.
      
      So basically anthing in the core of the network stack that was
      affected to by the change of dev_base was modified to handle
      multiple network namespaces.  The rest of the network stack was
      simply modified to explicitly use &init_net the initial network
      namespace.  This can be fixed when those components of the network
      stack are modified to handle multiple network namespaces.
      
      For now the ifindex generator is left global.
      
      Fundametally ifindex numbers are per namespace, or else
      we will have corner case problems with migration when
      we get that far.
      
      At the same time there are assumptions in the network stack
      that the ifindex of a network device won't change.  Making
      the ifindex number global seems a good compromise until
      the network stack can cope with ifindex changes when
      you change namespaces, and the like.
      Signed-off-by: default avatarEric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      881d966b
    • Eric W. Biederman's avatar
      [NET]: Support multiple network namespaces with netlink · b4b51029
      Eric W. Biederman authored
      
      
      Each netlink socket will live in exactly one network namespace,
      this includes the controlling kernel sockets.
      
      This patch updates all of the existing netlink protocols
      to only support the initial network namespace.  Request
      by clients in other namespaces will get -ECONREFUSED.
      As they would if the kernel did not have the support for
      that netlink protocol compiled in.
      
      As each netlink protocol is updated to be multiple network
      namespace safe it can register multiple kernel sockets
      to acquire a presence in the rest of the network namespaces.
      
      The implementation in af_netlink is a simple filter implementation
      at hash table insertion and hash table look up time.
      Signed-off-by: default avatarEric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      b4b51029
    • Eric W. Biederman's avatar
      [NET]: Make device event notification network namespace safe · e9dc8653
      Eric W. Biederman authored
      
      
      Every user of the network device notifiers is either a protocol
      stack or a pseudo device.  If a protocol stack that does not have
      support for multiple network namespaces receives an event for a
      device that is not in the initial network namespace it quite possibly
      can get confused and do the wrong thing.
      
      To avoid problems until all of the protocol stacks are converted
      this patch modifies all netdev event handlers to ignore events on
      devices that are not in the initial network namespace.
      
      As the rest of the code is made network namespace aware these
      checks can be removed.
      Signed-off-by: default avatarEric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      e9dc8653
    • Eric W. Biederman's avatar
      [NET]: Make /proc/net per network namespace · 457c4cbc
      Eric W. Biederman authored
      
      
      This patch makes /proc/net per network namespace.  It modifies the global
      variables proc_net and proc_net_stat to be per network namespace.
      The proc_net file helpers are modified to take a network namespace argument,
      and all of their callers are fixed to pass &init_net for that argument.
      This ensures that all of the /proc/net files are only visible and
      usable in the initial network namespace until the code behind them
      has been updated to be handle multiple network namespaces.
      
      Making /proc/net per namespace is necessary as at least some files
      in /proc/net depend upon the set of network devices which is per
      network namespace, and even more files in /proc/net have contents
      that are relevant to a single network namespace.
      Signed-off-by: default avatarEric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      457c4cbc
  3. 11 Sep, 2007 2 commits
    • Neil Horman's avatar
      [NETFILTER]: Fix/improve deadlock condition on module removal netfilter · 16fcec35
      Neil Horman authored
      
      
      So I've had a deadlock reported to me.  I've found that the sequence of
      events goes like this:
      
      1) process A (modprobe) runs to remove ip_tables.ko
      
      2) process B (iptables-restore) runs and calls setsockopt on a netfilter socket,
      increasing the ip_tables socket_ops use count
      
      3) process A acquires a file lock on the file ip_tables.ko, calls remove_module
      in the kernel, which in turn executes the ip_tables module cleanup routine,
      which calls nf_unregister_sockopt
      
      4) nf_unregister_sockopt, seeing that the use count is non-zero, puts the
      calling process into uninterruptible sleep, expecting the process using the
      socket option code to wake it up when it exits the kernel
      
      4) the user of the socket option code (process B) in do_ipt_get_ctl, calls
      ipt_find_table_lock, which in this case calls request_module to load
      ip_tables_nat.ko
      
      5) request_module forks a copy of modprobe (process C) to load the module and
      blocks until modprobe exits.
      
      6) Process C. forked by request_module process the dependencies of
      ip_tables_nat.ko, of which ip_tables.ko is one.
      
      7) Process C attempts to lock the request module and all its dependencies, it
      blocks when it attempts to lock ip_tables.ko (which was previously locked in
      step 3)
      
      Theres not really any great permanent solution to this that I can see, but I've
      developed a two part solution that corrects the problem
      
      Part 1) Modifies the nf_sockopt registration code so that, instead of using a
      use counter internal to the nf_sockopt_ops structure, we instead use a pointer
      to the registering modules owner to do module reference counting when nf_sockopt
      calls a modules set/get routine.  This prevents the deadlock by preventing set 4
      from happening.
      
      Part 2) Enhances the modprobe utilty so that by default it preforms non-blocking
      remove operations (the same way rmmod does), and add an option to explicity
      request blocking operation.  So if you select blocking operation in modprobe you
      can still cause the above deadlock, but only if you explicity try (and since
      root can do any old stupid thing it would like....  :)  ).
      Signed-off-by: default avatarNeil Horman <nhorman@tuxdriver.com>
      Signed-off-by: default avatarPatrick McHardy <kaber@trash.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      16fcec35
    • Patrick McHardy's avatar
      [NETFILTER]: nf_conntrack_ipv4: fix "Frag of proto ..." messages · 0fb96701
      Patrick McHardy authored
      
      
      Since we're now using a generic tuple decoding function in ICMP
      connection tracking, ipv4_get_l4proto() might get called with a
      fragmented packet from within an ICMP error. Remove the error
      message we used to print when this happens.
      Signed-off-by: default avatarPatrick McHardy <kaber@trash.net>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      0fb96701
  4. 14 Aug, 2007 1 commit
  5. 13 Aug, 2007 1 commit
  6. 07 Aug, 2007 2 commits
  7. 02 Aug, 2007 1 commit
  8. 26 Jul, 2007 1 commit
  9. 24 Jul, 2007 1 commit
  10. 19 Jul, 2007 1 commit
    • Yoann Padioleau's avatar
      some kmalloc/memset ->kzalloc (tree wide) · dd00cc48
      Yoann Padioleau authored
      
      
      Transform some calls to kmalloc/memset to a single kzalloc (or kcalloc).
      
      Here is a short excerpt of the semantic patch performing
      this transformation:
      
      @@
      type T2;
      expression x;
      identifier f,fld;
      expression E;
      expression E1,E2;
      expression e1,e2,e3,y;
      statement S;
      @@
      
       x =
      - kmalloc
      + kzalloc
        (E1,E2)
        ...  when != \(x->fld=E;\|y=f(...,x,...);\|f(...,x,...);\|x=E;\|while(...) S\|for(e1;e2;e3) S\)
      - memset((T2)x,0,E1);
      
      @@
      expression E1,E2,E3;
      @@
      
      - kzalloc(E1 * E2,E3)
      + kcalloc(E1,E2,E3)
      
      [akpm@linux-foundation.org: get kcalloc args the right way around]
      Signed-off-by: default avatarYoann Padioleau <padator@wanadoo.fr>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Acked-by: default avatarRussell King <rmk@arm.linux.org.uk>
      Cc: Bryan Wu <bryan.wu@analog.com>
      Acked-by: default avatarJiri Slaby <jirislaby@gmail.com>
      Cc: Dave Airlie <airlied@linux.ie>
      Acked-by: default avatarRoland Dreier <rolandd@cisco.com>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Acked-by: Dmitry Torokhov <dtor@mail.ru...
      dd00cc48
  11. 14 Jul, 2007 5 commits
  12. 11 Jul, 2007 1 commit
  13. 10 Jul, 2007 10 commits