  1. 24 Feb, 2017 1 commit
  2. 23 Feb, 2017 1 commit
  3. 22 Feb, 2017 1 commit
  4. 17 Feb, 2017 1 commit
  5. 15 Feb, 2017 1 commit
    • ofproto/bond: Validate active-slave mac. · 49dd1141
      nickcooper-zhangtonghao authored
      The mac of the active-slave is occasionally invalid
      (e.g. 00:00:00:00:00:00). The reason is described below.
      
      In bridge_reconfig():
      1. Bond devices are created in port_configure().
      2. The bonded interfaces may still be disabled even after calling
         bridge_run__(), because the interface link is not ready.
      
      OvS runs bridge_run__() again in the next loop, and the active-slave
      may be selected then. But if OvS then runs bridge_reconfig() again,
      bond_reconfigure() sets the active-slave mac to zero and the flag to
      false. If you use 'ovs-appctl bond/show bond-name' to check the
      active-slave mac, you will find that it is zero, and that the mac in
      the ovsdb is also zero.
      
      The active_slave_mac and active_slave_changed fields should be
      initialized when the bond is created, not on every reconfiguration.
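The sequence described in this commit message can be modelled in a few lines. The following is only a Python sketch of the idea, not the OVS C code: the `Bond` class, `choose_active_slave()` and the two `bond_reconfigure_*` variants are hypothetical stand-ins, and only the field names `active_slave_mac` and `active_slave_changed` come from the message.

```python
from dataclasses import dataclass

ZERO_MAC = "00:00:00:00:00:00"


@dataclass
class Bond:
    """Hypothetical stand-in for the bond state named in the commit."""
    name: str
    active_slave_mac: str = ZERO_MAC      # default set once, at creation
    active_slave_changed: bool = False    # default set once, at creation


def bond_create(name):
    # Creation is the only place where the two fields get their defaults.
    return Bond(name)


def choose_active_slave(bond, mac):
    # Models the main loop selecting an active slave once the link is up.
    bond.active_slave_mac = mac
    bond.active_slave_changed = True


def bond_reconfigure_buggy(bond):
    # Behaviour described as the bug: every reconfigure wipes the state.
    bond.active_slave_mac = ZERO_MAC
    bond.active_slave_changed = False


def bond_reconfigure_fixed(bond):
    # Behaviour the message asks for: reconfigure leaves the selection alone.
    pass
```

With the buggy variant, a reconfigure that follows the selection leaves the mac at zero, which matches what 'ovs-appctl bond/show' is said to report.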
      Signed-off-by: nickcooper-zhangtonghao <nic@opencloud.tech>
      Signed-off-by: Andy Zhou <azhou@ovn.org>
  6. 08 Feb, 2017 1 commit
  7. 01 Feb, 2017 1 commit
  8. 31 Jan, 2017 1 commit
  9. 20 Jan, 2017 2 commits
  10. 17 Jan, 2017 1 commit
  11. 12 Jan, 2017 1 commit
  12. 23 Dec, 2016 2 commits
    • rconn: Avoid abort for ill-behaved remote. · a77b1d99
      Ben Pfaff authored
      If an rconn peer fails to send a hello message, the version number doesn't
      get set.  Later, if the peer delays long enough, the rconn attempts to send
      an echo request but assert-fails instead because it doesn't know what
      version to use.  This fixes the problem.
      
      To reproduce this problem:
      
          make sandbox
          ovs-vsctl add-br br0
          ovs-vsctl set-controller br0 ptcp:12345
          nc 127.0.0.1 12345
      
      and wait 10 seconds for ovs-vswitchd to die.  (Then exit the sandbox.)
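The message does not spell out how the fix works. One plausible shape, sketched here in Python under that assumption, is to skip the echo request while the version is still unknown. The function `on_idle_timeout()` and the `send` callback are hypothetical names for illustration, not rconn API.

```python
def on_idle_timeout(version, send):
    """Model of the idle-timeout step.

    `version` is None until the peer's hello message has been received.
    Returns True if an echo request was sent.
    """
    if version is None:
        # No hello seen yet, so there is no version to put in an echo
        # request. Skip sending instead of asserting.
        return False
    send(("ECHO_REQUEST", version))
    return True
```

A peer that connects with `nc` and stays silent corresponds to the `version is None` branch.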
      Reported-by: 张东亚 <fortitude.zhang@gmail.com>
      Signed-off-by: Ben Pfaff <blp@ovn.org>
      Acked-by: Justin Pettit <jpettit@ovn.org>
    • lacp: Select a may-enable IF as the lead IF · 929f24fb
      Ben Pfaff authored
      A reboot of one switch in an MC-LAG bond makes all bond links
      go down, causing a total connectivity loss for 3 seconds.
      
      Packet capture shows that spurious LACP PDUs are sent to OVS with
      a different MAC address (partner system id) during the final
      stages of the MC-LAG switch reboot.
      
      The current code selects a lead interface based on information
      in the LACP PDU, regardless of its synchronization state. If a
      non-synchronized interface is selected as the OVS lead interface
      then all other interfaces are forced down as their stored partner
      system id differs and the bond ends up with no working interface.
      The bond recovers within three seconds after the last spurious
      message.
      
      To avoid the problem, this commit requires a lead interface
      to be synchronized. In case no synchronized interface exists,
      the selection of lead interface is done as in the current code.
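The selection rule in the last paragraph can be sketched as follows. This is a Python model under stated assumptions, not the lacp.c code: `Iface`, its `priority` field (lower wins, standing in for the ordering derived from the LACP PDU) and `select_lead()` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Iface:
    name: str
    priority: int        # lower value wins; stands in for the PDU-derived order
    synchronized: bool   # stands in for the "may-enable" state


def select_lead(ifaces):
    """Prefer synchronized interfaces; fall back to all of them if none is."""
    synced = [i for i in ifaces if i.synchronized]
    candidates = synced if synced else ifaces
    return min(candidates, key=lambda i: i.priority, default=None)
```

An unsynchronized interface with the best priority no longer wins as long as at least one synchronized interface exists.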
      Signed-off-by: Torgny Lindberg <torgny.lindberg@ericsson.com>
      Signed-off-by: Ben Pfaff <blp@ovn.org>
  13. 10 Dec, 2016 1 commit
    • ofproto-dpif-ipfix: Fix assertion failure for bad configuration. · 54c7a1d3
      Ben Pfaff authored
      The assertions in dpif_ipfix_set_options() made some bad assumptions about
      flow exporters.  The code that added and removed exporters would add a flow
      exporter even if it had an invalid configuration ("broken"), but the
      assertions checked that broken flow exporters were not added.  Thus,
      when a flow exporter was broken, ovs-vswitchd would crash due to an
      assertion failure.
      
      Here is an example vsctl command that, run in the sandbox, would crash
      ovs-vswitchd:
      
          ovs-vsctl \
              -- add-br br0 \
              -- --id=@br0 get bridge br0 \
              -- --id=@ipfix create ipfix target='["xyzzy"]' \
              -- create flow_sample_collector_set id=1 bridge=@br0 ipfix=@ipfix
      
      The minimal fix would be to remove the assertions, but this would leave
      broken flow exporters in place.  This commit goes a little farther and
      actually removes broken flow exporters.
      
      This fix pulls code out of an "if" statement to a higher level, so it is a
      smaller fix when viewed ignoring space changes.
      
      This bug dates back to the introduction of IPFIX in 2013.
      
      VMware-BZ: #1779123
      CC: Romain Lenglet <romain.lenglet@berabera.info>
      Fixes: 29089a54 ("Implement IPFIX export")
      Signed-off-by: Ben Pfaff <blp@ovn.org>
      Acked-by: Jarno Rajahalme <jarno@ovn.org>
  14. 09 Dec, 2016 3 commits
    • netdev-dpdk: Use instant sending instead of queueing of packets. · a1b2fc05
      Ilya Maximets authored
      The current implementation of TX packet queueing is broken in several ways:
      
      	* TX queue flushing, implemented on receive, assumes that all
      	  core_id-s are sequential and start from zero. This may leave
      	  packets stuck in the queue forever, and it also hurts
      	  latency.
      
      	* For a long time the flushing logic has depended on an
      	  uninitialized 'txq_needs_locking', because it is usually
      	  calculated after 'netdev_dpdk_alloc_txq' but is used inside
      	  that function to initialize 'flush_tx'.
      
      Testing shows no performance difference with and without queueing.
      Let's remove queueing altogether, because it doesn't work properly
      now and does not increase performance either.
      
      This improves latency compared to current openvswitch 2.5.
      
      Without the patch:
      
      Device 0->1:
        Throughput: 6.8683 Mpps
        Min. Latency: 39.9780 usec
        Avg. Latency: 61.1226 usec
        Max. Latency: 89.1110 usec
      
      Device 1->0:
        Throughput: 6.8683 Mpps
        Min. Latency: 41.0660 usec
        Avg. Latency: 58.7778 usec
        Max. Latency: 89.4720 usec
      
      With the patch:
      
      Device 0->1:
        Throughput: 6.3941 Mpps
        Min. Latency: 10.5410 usec
        Avg. Latency: 14.1309 usec
        Max. Latency: 28.9880 usec
      
      Device 1->0:
        Throughput: 6.3941 Mpps
        Min. Latency: 11.9780 usec
        Avg. Latency: 18.0692 usec
        Max. Latency: 29.5200 usec
      Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
      Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
      Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
    • csum: Fix csum_continue() on big endian with an odd number of bytes. · 8aec4abe
      Daniele Di Proietto authored
      Even though it reads 16 bits at a time, csum_continue() is almost
      neutral to endianness (see RFC 1071 1.2 (B), "Byte Order Independence").
      
      Consider a buffer like the following:
      
      00000000: XX YY XX YY XX YY XX YY ZZ
      
      Each pair of bytes is interpreted on little endian as:
      
      *data = 0xYYXX
      
      while on big endian
      
      *data = 0xXXYY
      
      The last byte "ZZ" should be treated as the two bytes "ZZ 00"
      little endian:
      
      *data = 0x00ZZ
      
      big endian:
      
      *data = 0xZZ00
      
      which means that the last byte (for odd buffers) should be left shifted
      by 8 bits on big endian platforms.
      
      This fixes a couple of connection tracking tests in userspace for big
      endian platforms.
      
      I guess RFC1071 4.1 (implementation example of the checksum in C), would
      manifest the same problem on big endian.
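The byte-order argument can be checked with a small model. The sketch below is Python, not the OVS `csum_continue()` code: `csum_partial()` and `csum_wire_bytes()` are hypothetical helpers that read 16-bit words the way a host of the given byte order would, and the `fix_odd_byte` switch exists only to show the effect of the left shift.

```python
import struct


def csum_partial(data, byteorder, fix_odd_byte=True):
    """Ones'-complement sum of `data`, reading 16-bit words as a host with
    the given byte order ('little' or 'big') would."""
    fmt = "<H" if byteorder == "little" else ">H"
    total = 0
    even_len = len(data) & ~1
    for off in range(0, even_len, 2):
        total += struct.unpack_from(fmt, data, off)[0]
    if len(data) % 2:
        last = data[-1]
        if byteorder == "big" and fix_odd_byte:
            last <<= 8   # "ZZ" must act as the word "ZZ 00", i.e. 0xZZ00
        total += last    # on little endian 0x00ZZ is already correct
    while total >> 16:   # fold carries back in (end-around carry)
        total = (total & 0xFFFF) + (total >> 16)
    return total


def csum_wire_bytes(data, byteorder, fix_odd_byte=True):
    """Checksum as the two bytes that host would store into the packet."""
    value = ~csum_partial(data, byteorder, fix_odd_byte) & 0xFFFF
    return value.to_bytes(2, byteorder)
```

For an odd-length buffer, both byte orders produce the same two bytes on the wire only when the big-endian side shifts the trailing byte by 8 bits. For even-length buffers the shift never comes into play.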
      
      Reported-at: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=840770
      Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
      Acked-by: Jarno Rajahalme <jarno@ovn.org>
    • ovs-vswitchd: Avoid segfault for "netdev" datapath. · 04db493d
      nickcooper-zhangtonghao authored
      When a datapath of type "netdev" processes packets in the
      userspace action, it may cause a segmentation fault. In
      dp_execute_userspace_action(), we pass NULL as the "wc" argument
      to dp_netdev_upcall(), but "wc" is used in the dp_netdev_upcall()
      call tree. For example, dp_netdev_upcall() uses &wc->masks for
      debugging, and flow_wildcards_init_for_packet() uses "wc" if
      megaflows are disabled, as shown in more detail below.
      
      Segmentation fault in flow_wildcards_init_for_packet:
      
          #0  0x0000000000468fe8 flow_wildcards_init_for_packet lib/flow.c:1275
          #1  0x0000000000436c0b upcall_cb ofproto/ofproto-dpif-upcall.c:1231
          #2  0x000000000045bd96 dp_netdev_upcall lib/dpif-netdev.c:3857
          #3  0x0000000000461bf3 dp_execute_userspace_action lib/dpif-netdev.c:4388
          #4  dp_execute_cb lib/dpif-netdev.c:4521
          #5  0x0000000000486ae2 odp_execute_actions lib/odp-execute.c:538
          #6  0x00000000004607f9 dp_netdev_execute_actions lib/dpif-netdev.c:4627
          #7  packet_batch_per_flow_execute lib/dpif-netdev.c:3927
          #8  dp_netdev_input__ lib/dpif-netdev.c:4229
          #9  0x0000000000460ba8 dp_netdev_input lib/dpif-netdev.c:4238
          #10 dp_netdev_process_rxq_port lib/dpif-netdev.c:2873
          #11 0x000000000046126e dpif_netdev_run lib/dpif-netdev.c:3000
          #12 0x000000000042baf5 type_run ofproto/ofproto-dpif.c:504
          #13 0x00000000004192bf ofproto_type_run ofproto/ofproto.c:1687
          #14 0x0000000000409965 bridge_run__ vswitchd/bridge.c:2875
          #15 0x000000000040f145 bridge_run vswitchd/bridge.c:2938
          #16 0x00000000004062e5 main vswitchd/ovs-vswitchd.c:111
      Signed-off-by: nickcooper-zhangtonghao <nic@opencloud.tech>
      Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
  15. 06 Dec, 2016 2 commits
  16. 05 Dec, 2016 2 commits
    • ofproto-dpif: Always forward 'used' from the old_rule. · 93cbf81f
      Jarno Rajahalme authored
      Use new rule's flags to determine whether stats should be forwarded
      from the old, modified rule to the new rule.  This captures the fact
      that prior to OpenFlow 1.2, which defines the reset counts flag, the
      reset counts semantics was assumed by default.  However, in that case
      the reset counts flag is only present in the new flow, not on the
      corresponding flow mod.
      
      Having the above fixed revealed that the 'used' timestamp was not
      forwarded from the old rule to the new rule when counts were not being
      forwarded.  Fix this by always forwarding the 'used' timestamp.
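The two rules above (counters follow the new rule's reset-counts flag, 'used' is always forwarded) can be sketched as below. This is a Python model only: `Rule`, its field names and `replace_rule()` are hypothetical and do not mirror the ofproto-dpif structures.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    reset_counts: bool = False   # flag carried by the new rule
    n_packets: int = 0
    n_bytes: int = 0
    used: int = 0                # last-used timestamp


def replace_rule(old, new):
    """Forward state from `old` to `new` when a flow is modified."""
    if not new.reset_counts:
        # Counters survive the modification unless the new rule resets them.
        new.n_packets = old.n_packets
        new.n_bytes = old.n_bytes
    # Forwarded unconditionally, even when the counters are reset.
    new.used = old.used
    return new
```

Checking the flag on the new rule rather than on the flow mod is what covers the pre-OpenFlow 1.2 case described in the first paragraph.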
      
      This is a cherry-pick that squashes in a fix for the original patch.
      
      Fixes: 39c94593 ("Use classifier versioning.")
      Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
      Acked-by: Ben Pfaff <blp@ovn.org>
    • mpls: Fix MPLS restoration after patch port and group bucket. · d8540e7b
      Jarno Rajahalme authored
      This patch fixes problems with MPLS handling related to patch ports
      and group buckets.
      
      If a group bucket or a peer bridge across a patch port pushes MPLS
      headers to a non-MPLS packet and outputs, the flow translation after
      returning from the group bucket or patch port would undo the packet
      transformations so that the processing could continue with the packet
      as it was before entering the patch port.  There were two problems
      with this:
      
      1. As part of the first MPLS push on a non-MPLS packet, the flow
      translation would first clear the L3/4 headers of the 'flow' to mark
      those fields invalid.  Later, when committing 'flow' changes to
      datapath actions before output, the necessary datapath MPLS actions
      are created and the corresponding changes updated to the 'base flow'.
      This was done using the same flow_push_mpls() function that clears
      the L3/4 headers, so the 'base flow' L3/4 headers were cleared as well.
      
      Then, when translation returns from a patch port or group bucket, the
      original 'flow' is restored, now showing no sign of the MPLS labels.
      Since the 'base flow' now has the MPLS labels, following translations
      know to issue MPLS POP actions before any output actions.  However, as
      part of checking for changes to IP headers we test that the IP
      protocol type was not changed.  But now the 'base flow's 'nw_proto'
      field is zero and an assert fail crashes OVS.
      
      This is solved by not clearing the L3/4 fields of the 'base
      flow'. This allows the processing after the patch port to continue
      with L3/4 fields as if no MPLS was done, after first issuing the
      necessary MPLS POP actions.
      
      2. IP header updates were done before the MPLS POP actions were
      issued. This caused incorrect packet output after, e.g., group action
      or patch port.  For example, with actions:
      
      group 1234: all bucket=push_mpls,output:LOCAL
      
      ip actions=group:1234,dec_ttl,output:LOCAL,output:LOCAL
      
      the dec_ttl would only be executed before the last output to LOCAL,
      since at the time of committing IP changes after the group action the
      packet was still an MPLS packet.
      
      This is solved by checking the dl_type of both 'flow' and 'base flow'
      and issuing MPLS actions if they can transform the packet from an MPLS
      packet to a non-MPLS packet.  For an IP packet the change in ttl can
      then be correctly committed before the last two output actions.
      
      Two test cases are added to prevent future regressions.
      Reported-by: Thomas Morin <thomas.morin@orange.com>
      Suggested-by: Takashi YAMAMOTO <yamamoto@ovn.org>
      Fixes: 8bfd0fda ("Enhance userspace support for MPLS, for up to 3 labels.")
      Fixes: 1b035ef2 ("mpls: Allow l3 and l4 actions to prior to a push_mpls action")
      Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
      Acked-by: YAMAMOTO Takashi <yamamoto@ovn.org>
  17. 24 Nov, 2016 1 commit
  18. 22 Nov, 2016 1 commit
  19. 17 Nov, 2016 2 commits
  20. 14 Nov, 2016 1 commit
  21. 20 Oct, 2016 1 commit
  22. 19 Oct, 2016 1 commit
  23. 17 Oct, 2016 1 commit
  24. 15 Oct, 2016 7 commits
  25. 13 Oct, 2016 2 commits
    • netdev-dpdk.h: Add missing copyright. · dbb92198
      Daniele Di Proietto authored
      Looks like we forgot to add the copyright headers to netdev-dpdk.h.
      Looking at the contribution history of the file, this commit adds the
      header with Nicira copyright.
      Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
      Acked-by: Ben Pfaff <blp@ovn.org>
      Acked-by: Aaron Conole <aconole@redhat.com>
      Tested-by: Aaron Conole <aconole@redhat.com>
    • dpif-netdev: Fix crash in dpif_netdev_execute(). · ebd33e6f
      Daniele Di Proietto authored
      dp_netdev_get_pmd() is allowed to return NULL (even if we call it with
      NON_PMD_CORE_ID) for different reasons:
      
      * Since we use RCU to protect pmd threads, it is possible that
        ovs_refcount_try_ref_rcu() has failed.
      * During reconfiguration we destroy every thread.
      
      This commit makes sure that we always handle the case when
      dp_netdev_get_pmd() returns NULL without crashing.
      
      This actually fixes a pretty serious crash that happens if
      dpif_netdev_execute() is called from a non pmd thread while
      reconfiguration is happening.  It can be triggered by enabling bfd
      (because it's handled by the monitor thread, which is a non pmd thread)
      on an interface and changing something that requires datapath
      reconfiguration (n_rxq, pmd-cpu-mask).
      
      A testcase that reproduces the race condition is included.
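The defensive pattern described above can be illustrated with a small Python model. It is not the dpif-netdev code: the functions below only borrow the C names for readability, the `pmds` dictionary is a hypothetical stand-in for the datapath's thread table, and returning `EBUSY` is an assumption, since the message does not say which error is reported.

```python
import errno


def dp_netdev_get_pmd(pmds, core_id):
    # May legitimately return None, e.g. while every thread is being
    # destroyed during reconfiguration.
    return pmds.get(core_id)


def dpif_netdev_execute(pmds, core_id, run_actions):
    pmd = dp_netdev_get_pmd(pmds, core_id)
    if pmd is None:
        # Assumed error code: report a failure instead of dereferencing
        # a missing thread.
        return errno.EBUSY
    run_actions(pmd)
    return 0
```

In this model the monitor-thread case corresponds to calling `dpif_netdev_execute()` while `pmds` is empty.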
      
      This is a possible backtrace of the segfault:
      
       #0  0x000000000060c7f1 in dp_execute_cb (aux_=0x7f1dd2d2a320,
       packets_=0x7f1dd2d2a370, a=0x7f1dd2d2a658, may_steal=false) at
       ../lib/dpif-netdev.c:4357
       #1  0x00000000006448b2 in odp_execute_actions (dp=0x7f1dd2d2a320,
       batch=0x7f1dd2d2a370, steal=false, actions=0x7f1dd2d2a658,
       actions_len=8,
           dp_execute_action=0x60c7a5 <dp_execute_cb>) at
       ../lib/odp-execute.c:538
       #2  0x000000000060d00c in dp_netdev_execute_actions (pmd=0x0,
       packets=0x7f1dd2d2a370, may_steal=false, flow=0x7f1dd2d2ae70,
       actions=0x7f1dd2d2a658, actions_len=8,
           now=44965873) at ../lib/dpif-netdev.c:4577
       #3  0x000000000060834a in dpif_netdev_execute (dpif=0x2b67b70,
       execute=0x7f1dd2d2a578) at ../lib/dpif-netdev.c:2624
       #4  0x0000000000608441 in dpif_netdev_operate (dpif=0x2b67b70,
       ops=0x7f1dd2d2a5c8, n_ops=1) at ../lib/dpif-netdev.c:2654
       #5  0x0000000000610a30 in dpif_operate (dpif=0x2b67b70,
       ops=0x7f1dd2d2a5c8, n_ops=1) at ../lib/dpif.c:1268
       #6  0x000000000061098c in dpif_execute (dpif=0x2b67b70,
       execute=0x7f1dd2d2aa50) at ../lib/dpif.c:1233
       #7  0x00000000005b9008 in ofproto_dpif_execute_actions__
       (ofproto=0x2b69360, version=18446744073709551614, flow=0x7f1dd2d2ae70,
       rule=0x0, ofpacts=0x7f1dd2d2b100,
           ofpacts_len=16, indentation=0, depth=0, resubmits=0,
       packet=0x7f1dd2d2b5c0) at ../ofproto/ofproto-dpif.c:3806
       #8  0x00000000005b907a in ofproto_dpif_execute_actions
       (ofproto=0x2b69360, version=18446744073709551614, flow=0x7f1dd2d2ae70,
       rule=0x0, ofpacts=0x7f1dd2d2b100,
           ofpacts_len=16, packet=0x7f1dd2d2b5c0) at
       ../ofproto/ofproto-dpif.c:3823
       #9  0x00000000005dea9b in xlate_send_packet (ofport=0x2b98380,
       oam=false, packet=0x7f1dd2d2b5c0) at
       ../ofproto/ofproto-dpif-xlate.c:5792
       #10 0x00000000005bab12 in ofproto_dpif_send_packet (ofport=0x2b98380,
       oam=false, packet=0x7f1dd2d2b5c0) at ../ofproto/ofproto-dpif.c:4628
       #11 0x00000000005c3fc8 in monitor_mport_run (mport=0x2b8cd00,
       packet=0x7f1dd2d2b5c0) at ../ofproto/ofproto-dpif-monitor.c:287
       #12 0x00000000005c3d9b in monitor_run () at
       ../ofproto/ofproto-dpif-monitor.c:227
       #13 0x00000000005c3cab in monitor_main (args=0x0) at
       ../ofproto/ofproto-dpif-monitor.c:189
       #14 0x00000000006a183a in ovsthread_wrapper (aux_=0x2b8afd0) at
       ../lib/ovs-thread.c:342
       #15 0x00007f1dd75eb444 in start_thread (arg=0x7f1dd2d2c700) at
       pthread_create.c:333
       #16 0x00007f1dd6e1d20d in clone () at
       ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
      Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
      Acked-by: Ben Pfaff <blp@ovn.org>
  26. 04 Oct, 2016 1 commit