diff --git a/src/algorithms/libs/volk_gnsssdr_module/volk_gnsssdr/docs/extending_volk.dox b/src/algorithms/libs/volk_gnsssdr_module/volk_gnsssdr/docs/extending_volk.dox
new file mode 100644
index 000000000..4a8b86385
--- /dev/null
+++ b/src/algorithms/libs/volk_gnsssdr_module/volk_gnsssdr/docs/extending_volk.dox
@@ -0,0 +1,92 @@
+/*! \page extending_volk Extending VOLK
+
+There are two primary routes for extending VOLK for your own use. The
+preferred route is by writing kernels and proto-kernels as part of this
+repository and sending patches upstream. The alternative is creating your
+own VOLK module, as is the case of VOLK_GNSSSDR ;-). There is a good reason
+for that: to provide GNSS-SDR users with adequate protokernels as soon as possible,
+without needing to upgrade to the latest VOLK version to enjoy the benefits.
+Notwithstanding, some VOLK_GNSSSDR kernels can be integrated into VOLK in the future,
+if other users find them useful.
+
+## Modifying this repository
+
+### Adding kernels
+
+Adding kernels refers to introducing a new function to the VOLK API that is
+presumably a useful math function/operation. The first step is to create
+the file in volk/kernels/volk. Follow the naming scheme provided in the
+VOLK terms and techniques page. First create the generic protokernel.
+
+The generic protokernel should be written in plain C using explicitly sized
+types from stdint.h or volk_complex.h when appropriate. volk_complex.h
+includes explicitly sized complex types for floats and ints. The name of
+the generic kernel should be volk_signature_from_file_generic. If multiple
+versions of the generic kernel exist then a description can be appended to
+generic_, but it is not required to use alignment flags in the generic
+protokernel name. It is required to surround the entire generic function
+with preprocessor ifdef fences on the symbol LV_HAVE_GENERIC.
+
+Finally, add the kernel to the list of test cases in volk/lib/kernel_tests.h.
+Many kernels should be able to use the default test parameters, but if yours
+requires a lower tolerance, specific vector length, or other test parameters,
+just create a new instance of volk_test_params_t for your kernel.
+
+### Adding protokernels
+
+The primary purpose of VOLK is to have multiple implementations of an operation
+tuned for a specific CPU architecture. Ideally there is at least one
+protokernel of each kernel for every architecture that VOLK supports.
+The pattern for protokernel naming is volk_kernel_signature_architecture_nick.
+The architecture should be one of the supported VOLK architectures. The nick is
+an optional name to distinguish between multiple implementations for a
+particular architecture.
+
+Architecture specific protokernels can be written in one of three ways.
+The first approach should always be to use compiler intrinsic functions.
+The second and third approaches are using either in-line assembly or
+assembly with .S files. Both methods of writing assembly exist in VOLK and
+should yield equivalent performance; which method you might choose is a
+matter of opinion. Regardless of the actual method, the public function should
+be declared in the kernel header surrounded by ifdef fences on the symbol that
+fits the architecture implementation.
+
+#### Compiler Intrinsics
+
+Compiler intrinsics should be treated as functions that map to a specific
+assembly instruction. Most VOLK kernels take the form of a loop that iterates
+through a vector. Form a loop that iterates on a number of items that is natural
+for the architecture and then use compiler intrinsics to do the math for your
+operation or algorithm. Include the appropriate header inside the ifdef fences,
+but before your protokernel declaration.
+
+
+#### In-line Assembly
+
+In-line assembly uses a compiler macro to include verbatim assembly with C code.
+The process of writing in-line assembly protokernels is very similar to that for
+protokernels based on intrinsics.
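The naming and fencing rules for the generic protokernel, and the loop shape described under Compiler Intrinsics, can be sketched in plain C. This is only an illustration under stated assumptions: the type `example_16sc_t` and every `example_*` function are hypothetical stand-ins invented here, not part of VOLK or VOLK_GNSSSDR, and `LV_HAVE_GENERIC` is defined by hand because there is no build system around the snippet. The inner four-item loop merely stands in for the SIMD instructions that an intrinsics protokernel would use.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: in a real build the VOLK machinery defines this symbol. */
#ifndef LV_HAVE_GENERIC
#define LV_HAVE_GENERIC
#endif

/* Hypothetical stand-in for a 16-bit complex type; not the library's own type. */
typedef struct
{
    int16_t re;
    int16_t im;
} example_16sc_t;

/* Complex product (a + jb)(c + jd) = (ac - bd) + j(ad + bc). */
static inline example_16sc_t example_cmul(example_16sc_t a, example_16sc_t b)
{
    example_16sc_t c;
    c.re = (int16_t)(a.re * b.re - a.im * b.im);
    c.im = (int16_t)(a.re * b.im + a.im * b.re);
    return c;
}

#ifdef LV_HAVE_GENERIC
/* Generic protokernel shape: plain C, one item per iteration, fenced by ifdef. */
static inline void example_16ic_x2_multiply_16ic_generic(example_16sc_t* out, const example_16sc_t* in_a, const example_16sc_t* in_b, unsigned int num_points)
{
    unsigned int n;
    for (n = 0; n < num_points; n++)
    {
        out[n] = example_cmul(in_a[n], in_b[n]);
    }
}
#endif /* LV_HAVE_GENERIC */

/* Same operation in the loop shape of an intrinsics protokernel:
 * blocks of four items, then a tail loop for the leftover items. */
static inline void example_16ic_x2_multiply_16ic_blocked(example_16sc_t* out, const example_16sc_t* in_a, const example_16sc_t* in_b, unsigned int num_points)
{
    const unsigned int quarter_points = num_points / 4;
    unsigned int number, k;
    for (number = 0; number < quarter_points; ++number)
    {
        /* Stand-in for the SIMD instructions that process four items at once. */
        for (k = 0; k < 4; ++k)
        {
            out[4 * number + k] = example_cmul(in_a[4 * number + k], in_b[4 * number + k]);
        }
    }
    for (number = quarter_points * 4; number < num_points; number++)
    {
        out[number] = example_cmul(in_a[number], in_b[number]);
    }
}

/* Asserts that both versions agree on a length that is not a multiple of 4;
 * returns 1 on success. */
int example_self_check(void)
{
    example_16sc_t a[7], b[7], ref[7], res[7];
    int i;
    for (i = 0; i < 7; i++)
    {
        a[i].re = (int16_t)(i + 1);
        a[i].im = (int16_t)(2 - i);
        b[i].re = (int16_t)(3 * i - 4);
        b[i].im = (int16_t)i;
    }
    example_16ic_x2_multiply_16ic_generic(ref, a, b, 7);
    example_16ic_x2_multiply_16ic_blocked(res, a, b, 7);
    for (i = 0; i < 7; i++)
    {
        assert(ref[i].re == res[i].re && ref[i].im == res[i].im);
    }
    return 1;
}
```

Comparing an optimized variant against the generic one on a length that is not a multiple of the block size exercises the tail loop as well as the blocked loop.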
+
+#### Assembly with .S files
+
+To write pure assembly protokernels, first declare the function name in the
+kernel header file the same way as any other protokernel, but include the extern
+keyword. Second, create a file (one for each protokernel) in
+volk/kernels/volk/asm/$arch. Disassemble another protokernel and copy the
+disassembled code into this file to bootstrap a working implementation. Often
+the disassembled code can be hand-tuned to improve performance.
+
+## VOLK Modules
+
+VOLK has a concept of modules. Each module is an independent VOLK tree. Modules
+can be managed with the volk_modtool application. At a high level the module is
+a clone of all of the VOLK machinery without kernels. volk_modtool also makes it
+easy to copy kernels to a module.
+
+Kernels and protokernels are added to your own VOLK module the same way they are
+added to this repository, which was described in the previous section.
+
+VOLK_GNSSSDR is a VOLK Module.
+
+*/
+
diff --git a/src/algorithms/libs/volk_gnsssdr_module/volk_gnsssdr/docs/kernels.dox b/src/algorithms/libs/volk_gnsssdr_module/volk_gnsssdr/docs/kernels.dox
new file mode 100644
index 000000000..127e2d464
--- /dev/null
+++ b/src/algorithms/libs/volk_gnsssdr_module/volk_gnsssdr/docs/kernels.dox
@@ -0,0 +1,24 @@
+/*! \page kernels Kernels
+
+\li \subpage volk_gnsssdr_32fc_convert_16ic
+\li \subpage volk_gnsssdr_32fc_convert_8ic
+\li \subpage volk_gnsssdr_16ic_convert_32fc
+\li \subpage volk_gnsssdr_16ic_resampler_16ic
+\li \subpage volk_gnsssdr_16ic_xn_resampler_16ic_xn
+\li \subpage volk_gnsssdr_16ic_s32fc_x2_rotator_16ic
+\li \subpage volk_gnsssdr_16ic_x2_multiply_16ic
+\li \subpage volk_gnsssdr_16ic_x2_dot_prod_16ic
+\li \subpage volk_gnsssdr_16ic_x2_dot_prod_16ic_xn
+\li \subpage volk_gnsssdr_16ic_x2_rotator_dot_prod_16ic_xn
+\li \subpage volk_gnsssdr_8i_accumulator_s8i
+\li \subpage volk_gnsssdr_8i_index_max_16u
+\li \subpage volk_gnsssdr_8i_max_s8i
+\li \subpage volk_gnsssdr_8i_x2_add_8i
+\li \subpage volk_gnsssdr_8ic_conjugate_8ic
+\li \subpage volk_gnsssdr_8ic_magnitude_squared_8i
+\li \subpage volk_gnsssdr_8ic_x2_dot_prod_8ic
+\li \subpage volk_gnsssdr_8ic_x2_multiply_8ic
+\li \subpage volk_gnsssdr_8ic_s8ic_multiply_8ic
+\li \subpage volk_gnsssdr_64f_accumulator_64f
+
+*/
diff --git a/src/algorithms/libs/volk_gnsssdr_module/volk_gnsssdr/docs/main_page.dox b/src/algorithms/libs/volk_gnsssdr_module/volk_gnsssdr/docs/main_page.dox
new file mode 100644
index 000000000..3d2409d22
--- /dev/null
+++ b/src/algorithms/libs/volk_gnsssdr_module/volk_gnsssdr/docs/main_page.dox
@@ -0,0 +1,19 @@
+/*! \mainpage VOLK_GNSSSDR
+
+Welcome to VOLK_GNSSSDR!
+
+VOLK_GNSSSDR is the Vector-Optimized Library of Kernels for GNSS-SDR.
+It is a library that contains kernels of hand-written SIMD code for different
+mathematical operations. Since each SIMD architecture can be very different
+and no compiler has yet come along to handle vectorization properly or highly
+efficiently, VOLK_GNSSSDR approaches the problem differently.
+
+For each architecture or platform that a developer wishes to vectorize for, a
+new proto-kernel is added to VOLK_GNSSSDR. At runtime, VOLK_GNSSSDR will select the correct
+proto-kernel. In this way, the users of VOLK_GNSSSDR call a kernel for performing the
+operation that is platform/architecture agnostic. This allows us to write
+portable SIMD code.
+
+VOLK_GNSSSDR is a module generated from the original VOLK library http://libvolk.org
+
+*/
diff --git a/src/algorithms/libs/volk_gnsssdr_module/volk_gnsssdr/docs/terms_and_techniques.dox b/src/algorithms/libs/volk_gnsssdr_module/volk_gnsssdr/docs/terms_and_techniques.dox
new file mode 100644
index 000000000..b7ec32bd0
--- /dev/null
+++ b/src/algorithms/libs/volk_gnsssdr_module/volk_gnsssdr/docs/terms_and_techniques.dox
@@ -0,0 +1,121 @@
+/*! \page concepts_terms_and_techniques Concepts, Terms, and Techniques
+
+This page is primarily a list of definitions and a brief overview of successful
+techniques previously used to develop VOLK_GNSSSDR protokernels.
+
+## Definitions and Concepts
+
+### SIMD
+
+SIMD stands for Single Instruction Multiple Data. Leveraging SIMD instructions
+is the primary optimization in VOLK_GNSSSDR.
+
+### Architecture
+
+A VOLK_GNSSSDR architecture is normally called an Instruction Set Architecture (ISA).
+The architectures we target in VOLK_GNSSSDR usually have SIMD instructions.
+
+### Vector
+
+A vector in VOLK_GNSSSDR is the same as a C array. It sometimes, but not always,
+coincides with the mathematical definition of a vector.
+
+### Kernel
+
+The 'kernel' part of the VOLK_GNSSSDR name comes from the high performance computing
+use of the word. In this context it is the inner loop of a vector operation.
+Since we don't use the word vector in the math sense, a vector operation is an
+operation that is performed on a C array.
+
+### Protokernel
+
+A protokernel is an implementation of a kernel. Every kernel has a 'generic'
+protokernel that is implemented in C. Other protokernels are optimized for a
+particular architecture.
+
+
+## Techniques
+
+### New Kernels
+
+Add new kernels to the list in lib/kernel_tests.h. This adds the kernel to
+VOLK_GNSSSDR's QA tool as well as the volk profiler. Many kernels are able to
+share test parameters, but new kernels might need new ones.
+
+If the VOLK_GNSSSDR kernel does not 'fit' the standard set of function parameters
+expected by the volk_profile structure, you need to create a VOLK_GNSSSDR puppet
+function to help the profiler call the kernel. This is essentially due to the
+function run_volk_gnsssdr_tests, which has a limited number of function prototypes that
+it can test.
+
+### Protokernels
+
+Adding new proto-kernels (implementations of VOLK_GNSSSDR kernels for specific
+architectures) is relatively easy. In the relevant .h file in
+the volk_gnsssdr/include/volk_gnsssdr/volk_gnsssdr.h
+file, add a new #ifdef/#endif block for the LV_HAVE_ symbol corresponding
+to the architecture you are working on (e.g. SSE, AVX, NEON, etc.).
+
+For example, for volk_gnsssdr_16ic_x2_multiply_16ic_neon:
+
+\code
+#ifdef LV_HAVE_NEON
+#include <arm_neon.h>
+
+static inline void volk_gnsssdr_16ic_x2_multiply_16ic_neon(lv_16sc_t* out, const lv_16sc_t* in_a, const lv_16sc_t* in_b, unsigned int num_points)
+{
+    lv_16sc_t *a_ptr = (lv_16sc_t*) in_a;
+    lv_16sc_t *b_ptr = (lv_16sc_t*) in_b;
+    unsigned int quarter_points = num_points / 4; /* four complex samples per iteration */
+    int16x4x2_t a_val, b_val, c_val;
+    int16x4x2_t tmp_real, tmp_imag;
+    unsigned int number = 0;
+
+    for(; number < quarter_points; ++number)
+        {
+            a_val = vld2_s16((int16_t*)a_ptr); /* val[0]: real parts, val[1]: imaginary parts */
+            b_val = vld2_s16((int16_t*)b_ptr);
+
+            tmp_real.val[0] = vmul_s16(a_val.val[0], b_val.val[0]);
+            tmp_real.val[1] = vmul_s16(a_val.val[1], b_val.val[1]);
+
+            tmp_imag.val[0] = vmul_s16(a_val.val[0], b_val.val[1]);
+            tmp_imag.val[1] = vmul_s16(a_val.val[1], b_val.val[0]);
+
+            c_val.val[0] = vsub_s16(tmp_real.val[0], tmp_real.val[1]);
+            c_val.val[1] = vadd_s16(tmp_imag.val[0], tmp_imag.val[1]);
+            vst2_s16((int16_t*)out, c_val);
+
+            a_ptr += 4;
+            b_ptr += 4;
+            out += 4;
+        }
+
+    for(number = quarter_points * 4; number < num_points; number++) /* leftover samples */
+        {
+            *out++ = (*a_ptr++) * (*b_ptr++);
+        }
+}
+#endif /* LV_HAVE_NEON */
+\endcode
+
+### Allocating Memory
+
+SIMD code can be very sensitive to the alignment of the vectors, which is
+generally something like a 16-byte or 32-byte alignment requirement. The
+VOLK_GNSSSDR dispatcher functions, which are what we will normally call as users of
+VOLK_GNSSSDR, make sure that the correct aligned or unaligned version is called
+depending on the state of the vectors passed to them. However, things typically
+work faster and more efficiently when the vectors are aligned. As such, VOLK_GNSSSDR
+has memory allocation and free functions to provide us with properly aligned
+vectors. We can also ask VOLK_GNSSSDR to give us the current machine's alignment
+requirement, which makes our job even easier when porting code.
+
+To get the machine's alignment, simply call size_t volk_gnsssdr_get_alignment().
+
+Allocate memory using void* volk_gnsssdr_malloc(size_t size, size_t alignment).
+
+Make sure that any memory allocated by VOLK_GNSSSDR is also freed by VOLK_GNSSSDR with volk_gnsssdr_free(void *p).
+
+
+*/
diff --git a/src/algorithms/libs/volk_gnsssdr_module/volk_gnsssdr/docs/using_volk_gnsssdr.dox b/src/algorithms/libs/volk_gnsssdr_module/volk_gnsssdr/docs/using_volk_gnsssdr.dox
new file mode 100644
index 000000000..d7081f0e0
--- /dev/null
+++ b/src/algorithms/libs/volk_gnsssdr_module/volk_gnsssdr/docs/using_volk_gnsssdr.dox
@@ -0,0 +1,19 @@
+/*! \page using_volk_gnsssdr Using VOLK_GNSSSDR
+
+Using VOLK_GNSSSDR in your code requires proper linking and including the correct headers.
+VOLK_GNSSSDR currently supports both C and C++ bindings.
+
+VOLK_GNSSSDR provides both a pkgconfig and CMake module to help with configuration and
+linking. The pkgconfig file is installed to
+$install_prefix/lib/pkgconfig/volk_gnsssdr.pc. The CMake configuration module is in
+$install_prefix/lib/cmake/volk_gnsssdr/VolkConfig.cmake.
+
+The VOLK_GNSSSDR include directory (includedir in pkgconfig,
+VOLK_GNSSSDR_INCLUDE_DIRS in cmake module) contains the header volk_gnsssdr/volk_gnsssdr.h, which defines all
+of the symbols exposed by VOLK_GNSSSDR. Alternatively, individual kernel headers are in
+the same location.
+
+In most cases it is sufficient to call the dispatcher for the kernel you are using.
+
+*/
+
diff --git a/src/algorithms/libs/volk_gnsssdr_module/volk_gnsssdr/kernels/volk_gnsssdr/volk_gnsssdr_16ic_convert_32fc.h b/src/algorithms/libs/volk_gnsssdr_module/volk_gnsssdr/kernels/volk_gnsssdr/volk_gnsssdr_16ic_convert_32fc.h
index 9a8400f7f..efa8d61fb 100644
--- a/src/algorithms/libs/volk_gnsssdr_module/volk_gnsssdr/kernels/volk_gnsssdr/volk_gnsssdr_16ic_convert_32fc.h
+++ b/src/algorithms/libs/volk_gnsssdr_module/volk_gnsssdr/kernels/volk_gnsssdr/volk_gnsssdr_16ic_convert_32fc.h
@@ -1,6 +1,6 @@
 /*!
- * \file volk_gnsssdr_32fc_convert_16ic.h
- * \brief Volk protokernel: converts 16 bit integer complex complex values to 32 bits float complex values
+ * \file volk_gnsssdr_16ic_convert_32fc.h
+ * \brief Volk protokernel: converts 16 bit integer complex values to 32 bits float complex values
  * \authors <ul>
  *          <li> Javier Arribas, 2015. jarribas(at)cttc.es
  *          </ul>