mirror of https://github.com/gnss-sdr/gnss-sdr synced 2024-12-14 20:20:35 +00:00

Adding documentation

Copied from VOLK, with some minor changes
Carles Fernandez 2016-03-09 21:01:22 +01:00
parent 81f4eadb5b
commit f6e713929a
6 changed files with 277 additions and 2 deletions


@ -0,0 +1,92 @@
/*! \page extending_volk Extending VOLK
There are two primary routes for extending VOLK for your own use. The
preferred route is by writing kernels and proto-kernels as part of this
repository and sending patches upstream. The alternative is creating your
own VOLK module, as is the case with VOLK_GNSSSDR ;-). There is a good reason
for that: to provide GNSS-SDR users with adequate protokernels as soon as possible,
without requiring them to upgrade to the latest VOLK version to enjoy the benefits.
Nevertheless, some VOLK_GNSSSDR kernels could be integrated into VOLK in the future
if other users find them useful.
## Modifying this repository
### Adding kernels
Adding kernels refers to introducing a new function to the VOLK API that is
presumably a useful math function/operation. The first step is to create
the file in volk/kernels/volk. Follow the naming scheme provided in the
VOLK terms and techniques page. First create the generic protokernel.
The generic protokernel should be written in plain C using explicitly sized
types from stdint.h or volk_complex.h when appropriate. volk_complex.h
includes explicitly sized complex types for floats and ints. The name of
the generic kernel should be volk_signature_from_file_generic. If multiple
versions of the generic kernel exist then a description can be appended to
generic_, but it is not required to use alignment flags in the generic
protokernel name. It is required to surround the entire generic function
with preprocessor ifdef fences on the symbol LV_HAVE_GENERIC.
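
As a minimal sketch of these conventions, the block below shows a generic
protokernel for a hypothetical kernel. The name volk_example_16i_x2_add_16i is
illustrative only and is not part of VOLK or VOLK_GNSSSDR. LV_HAVE_GENERIC is
defined by hand so that the fragment compiles on its own; in the library the
build system is assumed to provide it.

```c
#include <stdint.h>

// Assumption: defined here only so that the sketch is self-contained.
#ifndef LV_HAVE_GENERIC
#define LV_HAVE_GENERIC
#endif

#ifdef LV_HAVE_GENERIC
// Hypothetical kernel: element-wise sum of two vectors of 16-bit integers,
// written in plain C with explicitly sized types from stdint.h.
static inline void volk_example_16i_x2_add_16i_generic(int16_t* out, const int16_t* in_a, const int16_t* in_b, unsigned int num_points)
{
    unsigned int n;
    for (n = 0; n < num_points; n++)
    {
        out[n] = (int16_t)(in_a[n] + in_b[n]);
    }
}
#endif // LV_HAVE_GENERIC
```

The whole function sits between the ifdef fences, so a build that does not
define LV_HAVE_GENERIC would not see it at all.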
Finally, add the kernel to the list of test cases in volk/lib/kernel_tests.h.
Many kernels should be able to use the default test parameters, but if yours
requires a lower tolerance, a specific vector length, or other test parameters,
just create a new instance of volk_test_params_t for your kernel.
### Adding protokernels
The primary purpose of VOLK is to have multiple implementations of an operation
tuned for a specific CPU architecture. Ideally there is at least one
protokernel of each kernel for every architecture that VOLK supports.
The pattern for protokernel naming is volk_kernel_signature_architecture_nick.
The architecture should be one of the supported VOLK architectures. The nick is
an optional name to distinguish between multiple implementations for a
particular architecture.
Architecture specific protokernels can be written in one of three ways.
The first approach should always be to use compiler intrinsic functions.
The second and third approaches are using either in-line assembly or
assembly with .S files. Both methods of writing assembly exist in VOLK and
should yield equivalent performance; which method you choose is a
matter of opinion. Regardless of the method, the public function should
be declared in the kernel header, surrounded by ifdef fences on the symbol that
fits the architecture implementation.
#### Compiler Intrinsics
Compiler intrinsics should be treated as functions that map to a specific
assembly instruction. Most VOLK kernels take the form of a loop that iterates
through a vector. Form a loop that iterates on a number of items that is natural
for the architecture and then use compiler intrinsics to do the math for your
operation or algorithm. Include the appropriate header inside the ifdef fences,
but before your protokernel declaration.
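
The following sketch illustrates that loop structure with SSE intrinsics: a
main loop that consumes four floats per iteration, followed by a scalar loop
for the leftover points. All function names are hypothetical, and the
compiler-provided __SSE__ macro is used here as a stand-in for the library's
LV_HAVE_SSE symbol so that the fragment builds without the VOLK machinery.

```c
// Portable fallback so that the sketch also builds on non-SSE targets.
static inline void volk_example_32f_x2_add_32f_generic(float* out, const float* in_a, const float* in_b, unsigned int num_points)
{
    unsigned int n;
    for (n = 0; n < num_points; n++)
    {
        out[n] = in_a[n] + in_b[n];
    }
}

// Assumption: __SSE__ stands in for LV_HAVE_SSE in this sketch.
#ifdef __SSE__
#include <xmmintrin.h>
static inline void volk_example_32f_x2_add_32f_u_sse(float* out, const float* in_a, const float* in_b, unsigned int num_points)
{
    const unsigned int quarter_points = num_points / 4;
    unsigned int n;
    // Main loop: four floats (one 128-bit register) per iteration.
    for (n = 0; n < quarter_points; n++)
    {
        __m128 a = _mm_loadu_ps(in_a);  // unaligned load
        __m128 b = _mm_loadu_ps(in_b);
        _mm_storeu_ps(out, _mm_add_ps(a, b));
        in_a += 4;
        in_b += 4;
        out += 4;
    }
    // Tail loop: remaining points that do not fill a register.
    for (n = quarter_points * 4; n < num_points; n++)
    {
        *out++ = (*in_a++) + (*in_b++);
    }
}
#endif // __SSE__

// Compile-time selection, for illustration only.
static inline void example_add(float* out, const float* in_a, const float* in_b, unsigned int num_points)
{
#ifdef __SSE__
    volk_example_32f_x2_add_32f_u_sse(out, in_a, in_b, num_points);
#else
    volk_example_32f_x2_add_32f_generic(out, in_a, in_b, num_points);
#endif
}
```

Note that the intrinsics header is included inside the fences and before the
protokernel that needs it, as described above.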
#### In-line Assembly
In-line assembly uses a compiler macro to include verbatim assembly with C code.
The process of in-line assembly protokernels is very similar to protokernels
based on intrinsics.
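
As a rough illustration, the sketch below adds four floats using GCC-style
extended asm on x86-64, with a plain C fallback elsewhere. The function names
are hypothetical, and the guard on __GNUC__ and __x86_64__ is an assumption of
this sketch rather than a VOLK convention.

```c
// Portable fallback used when the inline assembly path is not available.
static inline void example_add4_generic(float* out, const float* a, const float* b)
{
    int i;
    for (i = 0; i < 4; i++)
    {
        out[i] = a[i] + b[i];
    }
}

#if defined(__GNUC__) && defined(__x86_64__)
// Same operation written as verbatim SSE assembly (AT&T syntax).
static inline void example_add4_asm(float* out, const float* a, const float* b)
{
    __asm__ volatile(
        "movups (%1), %%xmm0\n\t"     // load four floats from a
        "movups (%2), %%xmm1\n\t"     // load four floats from b
        "addps %%xmm1, %%xmm0\n\t"    // xmm0 = xmm0 + xmm1
        "movups %%xmm0, (%0)\n\t"     // store the result to out
        :
        : "r"(out), "r"(a), "r"(b)
        : "xmm0", "xmm1", "memory");
}
#define example_add4 example_add4_asm
#else
#define example_add4 example_add4_generic
#endif
```

The clobber list tells the compiler which registers and memory the assembly
touches, which is the main thing that intrinsics would otherwise handle for you.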
#### Assembly with .S files
To write pure assembly protokernels, first declare the function name in the
kernel header file the same way as any other protokernel, but include the extern
keyword. Second, create a file (one for each protokernel) in
volk/kernels/volk/asm/$arch. Disassemble another protokernel and copy the
disassembled code into this file to bootstrap a working implementation. Often
the disassembled code can be hand-tuned to improve performance.
## VOLK Modules
VOLK has a concept of modules. Each module is an independent VOLK tree. Modules
can be managed with the volk_modtool application. At a high level the module is
a clone of all of the VOLK machinery without kernels. volk_modtool also makes it
easy to copy kernels to a module.
Kernels and protokernels are added to your own VOLK module the same way they are
added to this repository, which was described in the previous section.
VOLK_GNSSSDR is a VOLK Module.
*/


@ -0,0 +1,24 @@
/*! \page kernels Kernels
\li \subpage volk_gnsssdr_32fc_convert_16ic
\li \subpage volk_gnsssdr_32fc_convert_8ic
\li \subpage volk_gnsssdr_16ic_convert_32fc
\li \subpage volk_gnsssdr_16ic_resampler_16ic
\li \subpage volk_gnsssdr_16ic_xn_resampler_16ic_xn
\li \subpage volk_gnsssdr_16ic_s32fc_x2_rotator_16ic
\li \subpage volk_gnsssdr_16ic_x2_multiply_16ic
\li \subpage volk_gnsssdr_16ic_x2_dot_prod_16ic
\li \subpage volk_gnsssdr_16ic_x2_dot_prod_16ic_xn
\li \subpage volk_gnsssdr_16ic_x2_rotator_dot_prod_16ic_xn
\li \subpage volk_gnsssdr_8i_accumulator_s8i
\li \subpage volk_gnsssdr_8i_index_max_16u
\li \subpage volk_gnsssdr_8i_max_s8i
\li \subpage volk_gnsssdr_8i_x2_add_8i
\li \subpage volk_gnsssdr_8ic_conjugate_8ic
\li \subpage volk_gnsssdr_8ic_magnitude_squared_8i
\li \subpage volk_gnsssdr_8ic_x2_dot_prod_8ic
\li \subpage volk_gnsssdr_8ic_x2_multiply_8ic
\li \subpage volk_gnsssdr_8ic_s8ic_multiply_8ic
\li \subpage volk_gnsssdr_64f_accumulator_64f
*/


@ -0,0 +1,19 @@
/*! \mainpage VOLK_GNSSSDR
Welcome to VOLK_GNSSSDR!
VOLK_GNSSSDR is the Vector-Optimized Library of Kernels for GNSS-SDR.
It is a library that contains kernels of hand-written SIMD code for different
mathematical operations. Since each SIMD architecture can be very different
and no compiler has yet come along to handle vectorization properly or highly
efficiently, VOLK_GNSSSDR approaches the problem differently.
For each architecture or platform that a developer wishes to vectorize for, a
new proto-kernel is added to VOLK_GNSSSDR. At runtime, VOLK_GNSSSDR will select the correct
proto-kernel. In this way, the users of VOLK_GNSSSDR call a kernel for performing the
operation that is platform/architecture agnostic. This allows us to write
portable SIMD code.
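
One common way to implement this kind of runtime selection is a function
pointer that initially points at a small setup routine. The sketch below
assumes that approach; every name in it is hypothetical, the capability check
is hard-coded, and it is not meant to describe the actual VOLK_GNSSSDR
dispatcher.

```c
typedef void (*example_sum_fn)(int* out, const int* in, unsigned int num_points);

// Plain C protokernel.
static void example_sum_generic(int* out, const int* in, unsigned int num_points)
{
    int acc = 0;
    unsigned int n;
    for (n = 0; n < num_points; n++) acc += in[n];
    *out = acc;
}

// Stands in for an architecture-specific protokernel (four partial sums).
static void example_sum_unrolled(int* out, const int* in, unsigned int num_points)
{
    int acc[4] = {0, 0, 0, 0};
    unsigned int n;
    for (n = 0; n + 4 <= num_points; n += 4)
    {
        acc[0] += in[n];
        acc[1] += in[n + 1];
        acc[2] += in[n + 2];
        acc[3] += in[n + 3];
    }
    for (; n < num_points; n++) acc[0] += in[n];
    *out = acc[0] + acc[1] + acc[2] + acc[3];
}

// Hypothetical capability check; a real library would query the CPU here.
static int example_cpu_has_simd(void) { return 1; }

static void example_sum_init(int* out, const int* in, unsigned int num_points);

// The "kernel" that users call. It starts out pointing at the setup routine.
static example_sum_fn example_sum = example_sum_init;

// First call: pick a protokernel, rebind the pointer, then forward the call.
static void example_sum_init(int* out, const int* in, unsigned int num_points)
{
    example_sum = example_cpu_has_simd() ? example_sum_unrolled : example_sum_generic;
    example_sum(out, in, num_points);
}
```

Callers only ever see example_sum, so the selection cost is paid once and the
calling code stays the same on every platform.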
VOLK_GNSSSDR is a module generated from the original VOLK library http://libvolk.org
*/


@ -0,0 +1,121 @@
/*! \page concepts_terms_and_techniques Concepts, Terms, and Techniques
This page is primarily a list of definitions and a brief overview of successful
techniques previously used to develop VOLK_GNSSSDR protokernels.
## Definitions and Concepts
### SIMD
SIMD stands for Single Instruction Multiple Data. Leveraging SIMD instructions
is the primary optimization in VOLK_GNSSSDR.
### Architecture
A VOLK_GNSSSDR architecture is normally called an Instruction Set Architecture (ISA).
The architectures we target in VOLK_GNSSSDR usually have SIMD instructions.
### Vector
A vector in VOLK_GNSSSDR is the same as a C array. It sometimes, but not always,
coincides with the mathematical definition of a vector.
### Kernel
The 'kernel' part of the VOLK_GNSSSDR name comes from the high performance computing
use of the word. In this context it is the inner loop of a vector operation.
Since we don't use the word vector in the math sense, a vector operation is an
operation that is performed on a C array.
### Protokernel
A protokernel is an implementation of a kernel. Every kernel has a 'generic'
protokernel that is implemented in C. Other protokernels are optimized for a
particular architecture.
## Techniques
### New Kernels
Add new kernels to the list in lib/kernel_tests.h. This adds the kernel to
VOLK_GNSSSDR's QA tool as well as the volk profiler. Many kernels are able to
share test parameters, but new kernels might need new ones.
If the VOLK_GNSSSDR kernel does not 'fit' the standard set of function parameters
expected by the volk_profile structure, you need to create a VOLK_GNSSSDR puppet
function to help the profiler call the kernel. This is essentially due to the
function run_volk_gnsssdr_tests which has a limited number of function prototypes that
it can test.
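
The idea of a puppet can be sketched as follows. Both function names are
hypothetical, and the "standard" prototype (output, input, number of points)
is an assumption made for this illustration; the real set of supported
prototypes is defined by the test code.

```c
#include <stdint.h>

// Hypothetical kernel whose extra scalar argument does not fit the
// prototype assumed to be supported by the test harness.
static inline void volk_example_16i_s16i_scale_16i_generic(int16_t* out, const int16_t* in, int16_t scalar, unsigned int num_points)
{
    unsigned int n;
    for (n = 0; n < num_points; n++)
    {
        out[n] = (int16_t)(in[n] * scalar);
    }
}

// Puppet: exposes the assumed standard (out, in, num_points) prototype and
// supplies a fixed value for the extra argument.
static inline void volk_example_16i_scalepuppet_16i_generic(int16_t* out, const int16_t* in, unsigned int num_points)
{
    const int16_t scalar = 3;  // arbitrary test value
    volk_example_16i_s16i_scale_16i_generic(out, in, scalar, num_points);
}
```

The puppet adds no functionality of its own; it only adapts the signature so
that the real kernel can be exercised.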
### Protokernels
Adding new proto-kernels (implementations of VOLK_GNSSSDR kernels for specific
architectures) is relatively easy. In the relevant kernel header,
volk_gnsssdr/include/volk_gnsssdr/volk_gnsssdr<input-fingerprint_function-name_output-fingerprint>.h,
add a new #ifdef/#endif block for the LV_HAVE_<arch> symbol corresponding
to the <arch> you are working on (e.g. SSE, AVX, NEON, etc.).
For example, for volk_gnsssdr_16ic_x2_multiply_16ic_neon:
\code
#ifdef LV_HAVE_NEON
#include <arm_neon.h>
static inline void volk_gnsssdr_16ic_x2_multiply_16ic_neon(lv_16sc_t* out, const lv_16sc_t* in_a, const lv_16sc_t* in_b, unsigned int num_points)
{
    lv_16sc_t *a_ptr = (lv_16sc_t*) in_a;
    lv_16sc_t *b_ptr = (lv_16sc_t*) in_b;
    unsigned int quarter_points = num_points / 4;
    int16x4x2_t a_val, b_val, c_val;
    int16x4x2_t tmp_real, tmp_imag;
    unsigned int number = 0;
    for(; number < quarter_points; ++number)
    {
        a_val = vld2_s16((int16_t*)a_ptr);
        b_val = vld2_s16((int16_t*)b_ptr);
        tmp_real.val[0] = vmul_s16(a_val.val[0], b_val.val[0]);
        tmp_real.val[1] = vmul_s16(a_val.val[1], b_val.val[1]);
        tmp_imag.val[0] = vmul_s16(a_val.val[0], b_val.val[1]);
        tmp_imag.val[1] = vmul_s16(a_val.val[1], b_val.val[0]);
        c_val.val[0] = vsub_s16(tmp_real.val[0], tmp_real.val[1]);
        c_val.val[1] = vadd_s16(tmp_imag.val[0], tmp_imag.val[1]);
        vst2_s16((int16_t*)out, c_val);
        a_ptr += 4;
        b_ptr += 4;
        out += 4;
    }
    for(number = quarter_points * 4; number < num_points; number++)
    {
        *out++ = (*a_ptr++) * (*b_ptr++);
    }
}
#endif /* LV_HAVE_NEON */
\endcode
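
For comparison, the arithmetic performed by the NEON lanes above can be written
as a scalar reference: for a = a_r + j a_i and b = b_r + j b_i, the product is
(a_r b_r - a_i b_i) + j (a_r b_i + a_i b_r). The sketch below uses a plain
struct, example_16sc_t, as a stand-in for lv_16sc_t so that it compiles without
the library headers; both the type and the function name are hypothetical.

```c
#include <stdint.h>

// Stand-in for lv_16sc_t: one complex sample with 16-bit integer parts.
typedef struct
{
    int16_t re;
    int16_t im;
} example_16sc_t;

// Scalar complex multiply. Results are narrowed back to 16 bits without
// saturation, so large products are not clamped.
static inline void example_16ic_x2_multiply_16ic_generic(example_16sc_t* out, const example_16sc_t* in_a, const example_16sc_t* in_b, unsigned int num_points)
{
    unsigned int n;
    for (n = 0; n < num_points; n++)
    {
        const example_16sc_t a = in_a[n];
        const example_16sc_t b = in_b[n];
        out[n].re = (int16_t)(a.re * b.re - a.im * b.im);
        out[n].im = (int16_t)(a.re * b.im + a.im * b.re);
    }
}
```

A reference like this is also a convenient way to check an architecture-specific
protokernel against known inputs.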
### Allocating Memory
SIMD code can be very sensitive to the alignment of the vectors, which is
generally something like a 16-byte or 32-byte alignment requirement. The
VOLK_GNSSSDR dispatcher functions, which are what we will normally call as users of
VOLK_GNSSSDR, make sure that the correct aligned or unaligned version is called
depending on the state of the vectors passed to them. However, things typically
work faster and more efficiently when the vectors are aligned. As such, VOLK_GNSSSDR
has memory allocation and free functions to provide us with properly aligned
vectors. We can also ask VOLK_GNSSSDR to give us the current machine's alignment
requirement, which makes our job even easier when porting code.
To get the machine's alignment, simply call size_t volk_gnsssdr_get_alignment().
Allocate memory using void* volk_gnsssdr_malloc(size_t size, size_t alignment).
Make sure that any memory allocated by VOLK_GNSSSDR is also freed by VOLK_GNSSSDR with volk_gnsssdr_free(void *p).
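
To make the pattern concrete without depending on the library, the sketch below
uses hypothetical stand-ins (example_get_alignment, example_malloc,
example_free) in place of the three functions named above. The 32-byte value
and the over-allocate-and-stash technique are assumptions of this sketch;
nothing here describes how VOLK_GNSSSDR implements its allocator.

```c
#include <stdint.h>
#include <stdlib.h>

// Stand-in for the alignment query; 32 bytes is an assumed value.
static size_t example_get_alignment(void) { return 32; }

// Stand-in allocator: over-allocate, round the address up to the requested
// alignment, and keep the raw pointer just in front of the aligned block.
// Assumes alignment is a power of two and at least sizeof(void*).
static void* example_malloc(size_t size, size_t alignment)
{
    void* raw = malloc(size + alignment + sizeof(void*));
    uintptr_t start;
    uintptr_t aligned;
    if (raw == NULL) return NULL;
    start = (uintptr_t)raw + sizeof(void*);
    aligned = (start + alignment - 1) & ~(uintptr_t)(alignment - 1);
    ((void**)aligned)[-1] = raw;
    return (void*)aligned;
}

// Matching free: recovers the raw pointer stored by example_malloc.
static void example_free(void* p)
{
    if (p != NULL) free(((void**)p)[-1]);
}

// The kind of check a dispatcher could use to choose between an aligned
// and an unaligned protokernel.
static int example_is_aligned(const void* p, size_t alignment)
{
    return ((uintptr_t)p % alignment) == 0;
}
```

With an allocator of this kind, passing the aligned pointer to plain free()
would be an error, which illustrates why allocation and deallocation should go
through the same library.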
*/


@ -0,0 +1,19 @@
/*! \page using_volk_gnsssdr Using VOLK_GNSSSDR
Using VOLK_GNSSSDR in your code requires proper linking and including the correct headers.
VOLK_GNSSSDR currently supports both C and C++ bindings.
VOLK_GNSSSDR provides both a pkgconfig file and a CMake module to help configuration and
linking. The pkgconfig file is installed to
$install_prefix/lib/pkgconfig/volk_gnsssdr.pc. The CMake configuration module is in
$install_prefix/lib/cmake/volk_gnsssdr/VolkConfig.cmake.
The VOLK_GNSSSDR include directory (includedir in pkgconfig,
VOLK_GNSSSDR_INCLUDE_DIRS in the CMake module) contains the header volk_gnsssdr/volk_gnsssdr.h, which defines all
of the symbols exposed by VOLK_GNSSSDR. Alternatively, individual kernel headers are in
the same location.
In most cases it is sufficient to call the dispatcher for the kernel you are using.
*/


@ -1,6 +1,6 @@
/*!
* \file volk_gnsssdr_32fc_convert_16ic.h
* \brief Volk protokernel: converts 16 bit integer complex complex values to 32 bits float complex values
* \file volk_gnsssdr_16ic_convert_32fc.h
* \brief Volk protokernel: converts 16 bit integer complex complex values to 32 bits float complex values
* \authors <ul>
* <li> Javier Arribas, 2015. jarribas(at)cttc.es
* </ul>