1. 07 Apr, 2014 2 commits
  2. 04 Mar, 2014 2 commits
  3. 19 Dec, 2013 3 commits
  4. 16 Dec, 2013 1 commit
    • Lorenzo Pieralisi's avatar
      arm64: kernel: build MPIDR_EL1 hash function data structure · 976d7d3f
      Lorenzo Pieralisi authored
      
      
      On ARM64 SMP systems, cores are identified by their MPIDR_EL1 register.
      The MPIDR_EL1 guidelines in the ARM ARM do not provide strict enforcement of
      MPIDR_EL1 layout, only recommendations that, if followed, split the MPIDR_EL1
      on ARM 64 bit platforms in four affinity levels. In multi-cluster
      systems like big.LITTLE, if the affinity guidelines are followed, the
      MPIDR_EL1 can not be considered a linear index. This means that the
      association between logical CPU in the kernel and the HW CPU identifier
      becomes somewhat more complicated requiring methods like hashing to
      associate a given MPIDR_EL1 to a CPU logical index, in order for the look-up
      to be carried out in an efficient and scalable way.
      
      This patch provides a function in the kernel that starting from the
      cpu_logical_map, implement collision-free hashing of MPIDR_EL1 values by
      checking all significative bits of MPIDR_EL1 affinity level bitfields.
      The hashing can then be carried out through bits shifting and ORing; the
      resulting hash algorithm is a collision-free though not minimal hash that can
      be executed with few assembly instructions. The mpidr_el1 is filtered through a
      mpidr mask that is built by checking all bits that toggle in the set of
      MPIDR_EL1s corresponding to possible CPUs. Bits that do not toggle do not
      carry information so they do not contribute to the resulting hash.
      
      Pseudo code:
      
      /* check all bits that toggle, so they are required */
      for (i = 1, mpidr_el1_mask = 0; i < num_possible_cpus(); i++)
      	mpidr_el1_mask |= (cpu_logical_map(i) ^ cpu_logical_map(0));
      
      /*
       * Build shifts to be applied to aff0, aff1, aff2, aff3 values to hash the
       * mpidr_el1
       * fls() returns the last bit set in a word, 0 if none
       * ffs() returns the first bit set in a word, 0 if none
       */
      fs0 = mpidr_el1_mask[7:0] ? ffs(mpidr_el1_mask[7:0]) - 1 : 0;
      fs1 = mpidr_el1_mask[15:8] ? ffs(mpidr_el1_mask[15:8]) - 1 : 0;
      fs2 = mpidr_el1_mask[23:16] ? ffs(mpidr_el1_mask[23:16]) - 1 : 0;
      fs3 = mpidr_el1_mask[39:32] ? ffs(mpidr_el1_mask[39:32]) - 1 : 0;
      ls0 = fls(mpidr_el1_mask[7:0]);
      ls1 = fls(mpidr_el1_mask[15:8]);
      ls2 = fls(mpidr_el1_mask[23:16]);
      ls3 = fls(mpidr_el1_mask[39:32]);
      bits0 = ls0 - fs0;
      bits1 = ls1 - fs1;
      bits2 = ls2 - fs2;
      bits3 = ls3 - fs3;
      aff0_shift = fs0;
      aff1_shift = 8 + fs1 - bits0;
      aff2_shift = 16 + fs2 - (bits0 + bits1);
      aff3_shift = 32 + fs3 - (bits0 + bits1 + bits2);
      u32 hash(u64 mpidr_el1) {
      	u32 l[4];
      	u64 mpidr_el1_masked = mpidr_el1 & mpidr_el1_mask;
      	l[0] = mpidr_el1_masked & 0xff;
      	l[1] = mpidr_el1_masked & 0xff00;
      	l[2] = mpidr_el1_masked & 0xff0000;
      	l[3] = mpidr_el1_masked & 0xff00000000;
      	return (l[0] >> aff0_shift | l[1] >> aff1_shift | l[2] >> aff2_shift |
      		l[3] >> aff3_shift);
      }
      
      The hashing algorithm relies on the inherent properties set in the ARM ARM
      recommendations for the MPIDR_EL1. Exotic configurations, where for instance
      the MPIDR_EL1 values at a given affinity level have large holes, can end up
      requiring big hash tables since the compression of values that can be achieved
      through shifting is somewhat crippled when holes are present. Kernel warns if
      the number of buckets of the resulting hash table exceeds the number of
      possible CPUs by a factor of 4, which is a symptom of a very sparse HW
      MPIDR_EL1 configuration.
      
      The hash algorithm is quite simple and can easily be implemented in assembly
      code, to be used in code paths where the kernel virtual address space is
      not set-up (ie cpu_resume) and instruction and data fetches are strongly
      ordered so code must be compact and must carry out few data accesses.
      Signed-off-by: default avatarLorenzo Pieralisi <lorenzo.pieralisi@arm.com>
      976d7d3f
  5. 25 Nov, 2013 1 commit
  6. 30 Oct, 2013 1 commit
  7. 25 Oct, 2013 2 commits
  8. 09 Oct, 2013 3 commits
  9. 26 Sep, 2013 1 commit
  10. 20 Sep, 2013 1 commit
  11. 30 Aug, 2013 1 commit
  12. 28 Aug, 2013 1 commit
  13. 14 May, 2013 1 commit
  14. 20 Mar, 2013 2 commits
  15. 29 Jan, 2013 1 commit
  16. 22 Jan, 2013 1 commit
  17. 18 Oct, 2012 1 commit
  18. 17 Sep, 2012 1 commit