mirror of
https://github.com/gnss-sdr/gnss-sdr
synced 2025-01-18 21:23:02 +00:00
Adding documentation
Copied from VOLK, with some minor changes
parent
81f4eadb5b
commit
f6e713929a
@ -0,0 +1,92 @@
|
|||||||
|
/*! \page extending_volk Extending VOLK
|
||||||
|
|
||||||
|
There are two primary routes for extending VOLK for your own use. The
preferred route is to write kernels and proto-kernels as part of this
repository and send patches upstream. The alternative is to create your
own VOLK module, as is the case for VOLK_GNSSSDR ;-). There is a good reason
for that: it provides GNSS-SDR users with adequate protokernels as soon as
possible, without requiring an upgrade to the latest VOLK version.
Nevertheless, some VOLK_GNSSSDR kernels may be integrated into VOLK in the
future if other users find them useful.
|
||||||
|
|
||||||
|
## Modifying this repository
|
||||||
|
|
||||||
|
### Adding kernels
|
||||||
|
|
||||||
|
Adding a kernel means introducing a new function to the VOLK API, presumably
a useful math function or operation. The first step is to create the file in
volk/kernels/volk, following the naming scheme provided in the VOLK terms and
techniques page. Then create the generic protokernel.
|
||||||
|
|
||||||
|
The generic protokernel should be written in plain C using explicitly sized
types from stdint.h or volk_complex.h when appropriate. volk_complex.h
provides explicitly sized complex types for floats and ints. The name of
the generic kernel should be volk_signature_from_file_generic. If multiple
versions of the generic kernel exist, a description can be appended to
generic_, but alignment flags are not required in the generic protokernel
name. The entire generic function must be surrounded with preprocessor ifdef
fences on the symbol LV_HAVE_GENERIC.
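
As a minimal sketch of the rules above, a generic protokernel could look like
the following. The kernel name volk_gnsssdr_16i_x2_add_16i is hypothetical and
chosen only for illustration; LV_HAVE_GENERIC is normally provided by the build
system and is defined here only so that the snippet compiles on its own.

```c
#include <stdint.h>

// Normally set by the build system; defined here only so the sketch is self-contained.
#ifndef LV_HAVE_GENERIC
#define LV_HAVE_GENERIC
#endif

#ifdef LV_HAVE_GENERIC
// Hypothetical kernel: element-wise sum of two vectors of 16-bit integers.
// Plain C with explicitly sized types, no architecture-specific code.
static inline void volk_gnsssdr_16i_x2_add_16i_generic(int16_t* out, const int16_t* in_a, const int16_t* in_b, unsigned int num_points)
{
    unsigned int n;
    for (n = 0; n < num_points; n++)
        {
            out[n] = (int16_t)(in_a[n] + in_b[n]);
        }
}
#endif  // LV_HAVE_GENERIC
```

The whole function, not just its body, sits inside the LV_HAVE_GENERIC fence,
so the symbol disappears entirely when the generic implementation is disabled.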
|
||||||
|
|
||||||
|
Finally, add the kernel to the list of test cases in volk/lib/kernel_tests.h.
Many kernels should be able to use the default test parameters, but if yours
requires a lower tolerance, a specific vector length, or other test parameters,
create a new instance of volk_test_params_t for your kernel.
|
||||||
|
|
||||||
|
### Adding protokernels
|
||||||
|
|
||||||
|
The primary purpose of VOLK is to provide multiple implementations of an
operation, each tuned for a specific CPU architecture. Ideally there is at
least one protokernel of each kernel for every architecture that VOLK supports.
The pattern for protokernel naming is volk_kernel_signature_architecture_nick.
The architecture should be one of the supported VOLK architectures. The nick is
an optional name that distinguishes between multiple implementations for a
particular architecture.
|
||||||
|
|
||||||
|
Architecture-specific protokernels can be written in one of three ways.
The first approach should always be to use compiler intrinsic functions.
The second and third approaches are in-line assembly and assembly in .S
files. Both methods of writing assembly exist in VOLK and should yield
equivalent performance; which one you choose is a matter of preference.
Regardless of the method, the public function should be declared in the
kernel header, surrounded by ifdef fences on the symbol that matches the
architecture implementation.
|
||||||
|
|
||||||
|
#### Compiler Intrinsics
|
||||||
|
|
||||||
|
Compiler intrinsics should be treated as functions that map to a specific
assembly instruction. Most VOLK kernels take the form of a loop that iterates
through a vector. Write a loop that processes, per iteration, a number of items
that is natural for the architecture, and use compiler intrinsics to do the
math for your operation or algorithm. Include the appropriate header inside the
ifdef fences, but before your protokernel declaration.
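
The loop structure described above can be sketched with SSE intrinsics as
follows. This is an illustration under stated assumptions, not code from the
library: the kernel name volk_gnsssdr_32f_x2_add_32f and the wrapper
demo_32f_x2_add_32f are hypothetical, and the compiler-defined __SSE__ symbol
stands in for the LV_HAVE_SSE fence so that the snippet compiles anywhere.

```c
// Hypothetical generic protokernel, used as the fallback on non-SSE targets.
static inline void volk_gnsssdr_32f_x2_add_32f_generic(float* out, const float* in_a, const float* in_b, unsigned int num_points)
{
    unsigned int n;
    for (n = 0; n < num_points; n++)
        {
            out[n] = in_a[n] + in_b[n];
        }
}

#ifdef __SSE__            // stand-in for LV_HAVE_SSE in this sketch
#include <xmmintrin.h>    // header included inside the fence, before the protokernel

// Hypothetical SSE protokernel: four floats per iteration, unaligned loads and stores.
static inline void volk_gnsssdr_32f_x2_add_32f_u_sse(float* out, const float* in_a, const float* in_b, unsigned int num_points)
{
    const unsigned int quarter_points = num_points / 4;
    unsigned int n;

    for (n = 0; n < quarter_points; n++)
        {
            __m128 a = _mm_loadu_ps(in_a);
            __m128 b = _mm_loadu_ps(in_b);
            _mm_storeu_ps(out, _mm_add_ps(a, b));
            in_a += 4;
            in_b += 4;
            out += 4;
        }

    // Scalar tail for the items that do not fill a whole SSE register
    for (n = quarter_points * 4; n < num_points; n++)
        {
            *out++ = (*in_a++) + (*in_b++);
        }
}
#endif  // __SSE__

// Hypothetical wrapper so the sketch can be exercised on any target.
static inline void demo_32f_x2_add_32f(float* out, const float* in_a, const float* in_b, unsigned int num_points)
{
#ifdef __SSE__
    volk_gnsssdr_32f_x2_add_32f_u_sse(out, in_a, in_b, num_points);
#else
    volk_gnsssdr_32f_x2_add_32f_generic(out, in_a, in_b, num_points);
#endif
}
```

Four floats per iteration is the natural item count for a 128-bit SSE
register; the scalar tail loop handles vector lengths that are not a multiple
of four.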
|
||||||
|
|
||||||
|
|
||||||
|
#### In-line Assembly
|
||||||
|
|
||||||
|
In-line assembly uses a compiler extension to embed verbatim assembly in C code.
Writing in-line assembly protokernels is very similar to writing protokernels
based on intrinsics.
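
For illustration only, the sketch below embeds a single x86 instruction with
the GCC/Clang extended asm syntax. The function name is hypothetical, and the
portable fallback branch exists only so that the snippet builds on other
compilers and targets; a real protokernel would be fenced on its LV_HAVE_<arch>
symbol instead.

```c
#include <stdint.h>

// Hypothetical protokernel: element-wise sum of two vectors of 32-bit integers.
static inline void demo_32i_x2_add_32i_asm(int32_t* out, const int32_t* in_a, const int32_t* in_b, unsigned int num_points)
{
    unsigned int n;
    for (n = 0; n < num_points; n++)
        {
            int32_t acc = in_a[n];
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
            // acc += in_b[n], written as one x86 instruction (AT&T operand order)
            __asm__("addl %1, %0"
                    : "+r"(acc)
                    : "r"(in_b[n])
                    : "cc");
#else
            acc += in_b[n];  // portable fallback for other compilers and targets
#endif
            out[n] = acc;
        }
}
```

The surrounding loop, pointer handling, and function signature are ordinary C,
which is why the workflow is so close to the intrinsics case.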
|
||||||
|
|
||||||
|
#### Assembly with .S files
|
||||||
|
|
||||||
|
To write pure assembly protokernels, first declare the function in the
kernel header file the same way as any other protokernel, but add the extern
keyword. Second, create a file (one for each protokernel) in
volk/kernels/volk/asm/$arch. To bootstrap a working implementation, disassemble
another protokernel and copy the disassembled code into this file. The
disassembled code can often be hand-tuned to improve performance.
|
||||||
|
|
||||||
|
## VOLK Modules
|
||||||
|
|
||||||
|
VOLK has a concept of modules. Each module is an independent VOLK tree. Modules
can be managed with the volk_modtool application. At a high level, a module is
a clone of all of the VOLK machinery without the kernels. volk_modtool also
makes it easy to copy kernels to a module.
|
||||||
|
|
||||||
|
Kernels and protokernels are added to your own VOLK module the same way they are
added to this repository, as described in the previous section.
|
||||||
|
|
||||||
|
VOLK_GNSSSDR is a VOLK Module.
|
||||||
|
|
||||||
|
*/
|
||||||
|
|
@ -0,0 +1,24 @@
|
|||||||
|
/*! \page kernels Kernels
|
||||||
|
|
||||||
|
\li \subpage volk_gnsssdr_32fc_convert_16ic
\li \subpage volk_gnsssdr_32fc_convert_8ic
\li \subpage volk_gnsssdr_16ic_convert_32fc
\li \subpage volk_gnsssdr_16ic_resampler_16ic
\li \subpage volk_gnsssdr_16ic_xn_resampler_16ic_xn
\li \subpage volk_gnsssdr_16ic_s32fc_x2_rotator_16ic
\li \subpage volk_gnsssdr_16ic_x2_multiply_16ic
\li \subpage volk_gnsssdr_16ic_x2_dot_prod_16ic
\li \subpage volk_gnsssdr_16ic_x2_dot_prod_16ic_xn
\li \subpage volk_gnsssdr_16ic_x2_rotator_dot_prod_16ic_xn
\li \subpage volk_gnsssdr_8i_accumulator_s8i
\li \subpage volk_gnsssdr_8i_index_max_16u
\li \subpage volk_gnsssdr_8i_max_s8i
\li \subpage volk_gnsssdr_8i_x2_add_8i
\li \subpage volk_gnsssdr_8ic_conjugate_8ic
\li \subpage volk_gnsssdr_8ic_magnitude_squared_8i
\li \subpage volk_gnsssdr_8ic_x2_dot_prod_8ic
\li \subpage volk_gnsssdr_8ic_x2_multiply_8ic
\li \subpage volk_gnsssdr_8ic_s8ic_multiply_8ic
\li \subpage volk_gnsssdr_64f_accumulator_64f
|
||||||
|
|
||||||
|
*/
|
@ -0,0 +1,19 @@
|
|||||||
|
/*! \mainpage VOLK_GNSSSDR
|
||||||
|
|
||||||
|
Welcome to VOLK_GNSSSDR!
|
||||||
|
|
||||||
|
VOLK_GNSSSDR is the Vector-Optimized Library of Kernels for GNSS-SDR.
It is a library that contains kernels of hand-written SIMD code for different
mathematical operations. Since SIMD architectures can differ widely, and no
compiler has yet come along that handles vectorization properly or highly
efficiently, VOLK_GNSSSDR approaches the problem differently.
|
||||||
|
|
||||||
|
For each architecture or platform that a developer wishes to vectorize for, a
new proto-kernel is added to VOLK_GNSSSDR. At runtime, VOLK_GNSSSDR selects the
correct proto-kernel. In this way, users of VOLK_GNSSSDR call a kernel that
performs the operation in a platform- and architecture-agnostic way. This
allows us to write portable SIMD code.
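
The idea of selecting a proto-kernel at runtime can be sketched with a function
pointer that is resolved on first use. This is a simplified illustration and
not the actual VOLK_GNSSSDR dispatch machinery: every name below is
hypothetical, and demo_cpu_has_fast_path() is a stub standing in for real CPU
feature detection.

```c
#include <stddef.h>

typedef void (*demo_add_fn)(float*, const float*, const float*, unsigned int);

// Hypothetical generic proto-kernel.
static void demo_add_generic(float* out, const float* a, const float* b, unsigned int num_points)
{
    unsigned int n;
    for (n = 0; n < num_points; n++) out[n] = a[n] + b[n];
}

// Stand-in for an architecture-specific proto-kernel (same math, different name).
static void demo_add_fast(float* out, const float* a, const float* b, unsigned int num_points)
{
    unsigned int n;
    for (n = 0; n < num_points; n++) out[n] = a[n] + b[n];
}

// Stub: a real implementation would query the CPU for the required features.
static int demo_cpu_has_fast_path(void)
{
    return 1;
}

// Resolved implementation; NULL until the first call.
static demo_add_fn demo_add_impl = NULL;

// What the user calls: architecture-agnostic entry point.
static void demo_add(float* out, const float* a, const float* b, unsigned int num_points)
{
    if (demo_add_impl == NULL)
        {
            demo_add_impl = demo_cpu_has_fast_path() ? demo_add_fast : demo_add_generic;
        }
    demo_add_impl(out, a, b, num_points);
}
```

User code only ever calls demo_add(); which implementation runs is decided once
and then reused on every later call.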
|
||||||
|
|
||||||
|
VOLK_GNSSSDR is a module generated from the original VOLK library (http://libvolk.org).
|
||||||
|
|
||||||
|
*/
|
@ -0,0 +1,121 @@
|
|||||||
|
/*! \page concepts_terms_and_techniques Concepts, Terms, and Techniques
|
||||||
|
|
||||||
|
This page is primarily a list of definitions and a brief overview of techniques
that have been used successfully to develop VOLK_GNSSSDR protokernels.
|
||||||
|
|
||||||
|
## Definitions and Concepts
|
||||||
|
|
||||||
|
### SIMD
|
||||||
|
|
||||||
|
SIMD stands for Single Instruction Multiple Data. Leveraging SIMD instructions
|
||||||
|
is the primary optimization in VOLK_GNSSSDR.
|
||||||
|
|
||||||
|
### Architecture
|
||||||
|
|
||||||
|
What VOLK_GNSSSDR calls an architecture is normally called an Instruction Set
Architecture (ISA). The architectures we target in VOLK_GNSSSDR usually have
SIMD instructions.
|
||||||
|
|
||||||
|
### Vector
|
||||||
|
|
||||||
|
A vector in VOLK_GNSSSDR is the same as a C array. It sometimes, but not always,
coincides with the mathematical definition of a vector.
|
||||||
|
|
||||||
|
### Kernel
|
||||||
|
|
||||||
|
The 'kernel' part of the VOLK_GNSSSDR name comes from the high performance
computing use of the word. In this context it is the inner loop of a vector
operation. Since we don't use the word vector in the mathematical sense, a
vector operation is simply an operation that is performed on a C array.
|
||||||
|
|
||||||
|
### Protokernel
|
||||||
|
|
||||||
|
A protokernel is an implementation of a kernel. Every kernel has a 'generic'
|
||||||
|
protokernel that is implemented in C. Other protokernels are optimized for a
|
||||||
|
particular architecture.
|
||||||
|
|
||||||
|
|
||||||
|
## Techniques
|
||||||
|
|
||||||
|
### New Kernels
|
||||||
|
|
||||||
|
Add new kernels to the list in lib/kernel_tests.h. This adds the kernel to
|
||||||
|
VOLK_GNSSSDR's QA tool as well as the volk profiler. Many kernels are able to
|
||||||
|
share test parameters, but new kernels might need new ones.
|
||||||
|
|
||||||
|
If the VOLK_GNSSSDR kernel does not fit the standard set of function parameters
expected by the volk_profile structure, you need to create a VOLK_GNSSSDR puppet
function to help the profiler call the kernel. This is needed because the
function run_volk_gnsssdr_tests can only test a limited number of function
prototypes.
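
The puppet idea can be sketched as a thin wrapper that exposes a supported
prototype and fills in the extra arguments itself. Everything below is
hypothetical: the kernel, its puppet, the fixed scalar value, and the
assumption that the test harness only knows the (out, in, num_points)
prototype are made up for illustration.

```c
#include <stdint.h>

// Hypothetical kernel with an extra scalar argument that, by assumption,
// the test harness does not know how to supply.
static inline void demo_16i_s16i_scale_16i_generic(int16_t* out, const int16_t* in, int16_t scalar, unsigned int num_points)
{
    unsigned int n;
    for (n = 0; n < num_points; n++)
        {
            out[n] = (int16_t)(in[n] * scalar);
        }
}

// Puppet: has a prototype the harness can call, and forwards to the real
// kernel with a fixed, arbitrarily chosen value for the extra argument.
static inline void demo_16i_scalepuppet_16i_generic(int16_t* out, const int16_t* in, unsigned int num_points)
{
    const int16_t scalar = 3;
    demo_16i_s16i_scale_16i_generic(out, in, scalar, num_points);
}
```

The puppet contains no math of its own, so testing or profiling it exercises
the wrapped kernel.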
|
||||||
|
|
||||||
|
### Protokernels
|
||||||
|
|
||||||
|
Adding new proto-kernels (implementations of VOLK_GNSSSDR kernels for specific
architectures) is relatively easy. In the relevant <kernel>.h file, that is,
volk_gnsssdr/include/volk_gnsssdr/volk_gnsssdr<input-fingerprint_function-name_output-fingerprint>.h,
add a new #ifdef/#endif block for the LV_HAVE_<arch> symbol corresponding
to the <arch> you are working on (e.g. SSE, AVX, NEON, etc.).
|
||||||
|
|
||||||
|
For example, for volk_gnsssdr_16ic_x2_multiply_16ic_neon:
|
||||||
|
|
||||||
|
\code
#ifdef LV_HAVE_NEON
#include <arm_neon.h>

static inline void volk_gnsssdr_16ic_x2_multiply_16ic_neon(lv_16sc_t* out, const lv_16sc_t* in_a, const lv_16sc_t* in_b, unsigned int num_points)
{
    lv_16sc_t *a_ptr = (lv_16sc_t*) in_a;
    lv_16sc_t *b_ptr = (lv_16sc_t*) in_b;
    unsigned int quarter_points = num_points / 4;
    int16x4x2_t a_val, b_val, c_val;
    int16x4x2_t tmp_real, tmp_imag;
    unsigned int number = 0;

    // Main loop: four complex items per iteration
    for(; number < quarter_points; ++number)
        {
            // De-interleaving loads: val[0] holds the real parts, val[1] the imaginary parts
            a_val = vld2_s16((int16_t*)a_ptr);
            b_val = vld2_s16((int16_t*)b_ptr);

            tmp_real.val[0] = vmul_s16(a_val.val[0], b_val.val[0]);  // ar * br
            tmp_real.val[1] = vmul_s16(a_val.val[1], b_val.val[1]);  // ai * bi

            tmp_imag.val[0] = vmul_s16(a_val.val[0], b_val.val[1]);  // ar * bi
            tmp_imag.val[1] = vmul_s16(a_val.val[1], b_val.val[0]);  // ai * br

            c_val.val[0] = vsub_s16(tmp_real.val[0], tmp_real.val[1]);  // real part
            c_val.val[1] = vadd_s16(tmp_imag.val[0], tmp_imag.val[1]);  // imaginary part
            vst2_s16((int16_t*)out, c_val);  // re-interleave and store

            a_ptr += 4;
            b_ptr += 4;
            out += 4;
        }

    // Tail loop: remaining items that do not fill a whole register
    for(number = quarter_points * 4; number < num_points; number++)
        {
            *out++ = (*a_ptr++) * (*b_ptr++);
        }
}
#endif /* LV_HAVE_NEON */
\endcode
|
||||||
|
|
||||||
|
### Allocating Memory
|
||||||
|
|
||||||
|
SIMD code can be very sensitive to the alignment of the vectors, which
generally must be aligned to something like 16 or 32 bytes. The VOLK_GNSSSDR
dispatcher functions, which are what we normally call as users of
VOLK_GNSSSDR, make sure that the correct aligned or unaligned version is called
depending on the state of the vectors passed to them. However, things typically
work faster and more efficiently when the vectors are aligned. As such,
VOLK_GNSSSDR has memory allocate and free functions that provide us with
properly aligned vectors. We can also ask VOLK_GNSSSDR for the current
machine's alignment requirement, which makes our job even easier when porting
code.
|
||||||
|
|
||||||
|
To get the machine's alignment, call size_t volk_gnsssdr_get_alignment().
|
||||||
|
|
||||||
|
Allocate memory using void* volk_gnsssdr_malloc(size_t size, size_t alignment).
|
||||||
|
|
||||||
|
Make sure that any memory allocated by VOLK_GNSSSDR is also freed by VOLK_GNSSSDR with volk_gnsssdr_free(void *p).
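
To make the allocate/use/free pattern concrete without depending on the
library, the sketch below uses stand-in functions with the same shapes as the
three functions named above. The demo_* names, the fixed 32-byte alignment, and
the over-allocation technique are assumptions for illustration; they are not
how VOLK_GNSSSDR implements its allocator.

```c
#include <stdint.h>
#include <stdlib.h>

// Stand-in for volk_gnsssdr_get_alignment(); 32 bytes is an assumed value.
static size_t demo_get_alignment(void)
{
    return 32;
}

// Stand-in for volk_gnsssdr_malloc(size, alignment).
// alignment must be a power of two and at least sizeof(void*).
static void* demo_malloc(size_t size, size_t alignment)
{
    // Over-allocate: room for the payload, the worst-case shift, and one pointer slot
    void* raw = malloc(size + alignment - 1 + sizeof(void*));
    uintptr_t start;
    uintptr_t aligned;
    if (raw == NULL) return NULL;
    start = (uintptr_t)raw + sizeof(void*);
    aligned = (start + alignment - 1) & ~(uintptr_t)(alignment - 1);
    ((void**)aligned)[-1] = raw;  // remember the original pointer for demo_free()
    return (void*)aligned;
}

// Stand-in for volk_gnsssdr_free(p); only valid for pointers from demo_malloc().
static void demo_free(void* p)
{
    if (p != NULL) free(((void**)p)[-1]);
}
```

With the real library, the same three steps would use
volk_gnsssdr_get_alignment(), volk_gnsssdr_malloc(), and volk_gnsssdr_free().
The sketch also shows why the allocator and the deallocator must match: the
aligned pointer handed to the caller is generally not the pointer that the
underlying malloc() returned.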
|
||||||
|
|
||||||
|
|
||||||
|
*/
|
@ -0,0 +1,19 @@
|
|||||||
|
/*! \page using_volk_gnsssdr Using VOLK_GNSSSDR
|
||||||
|
|
||||||
|
Using VOLK_GNSSSDR in your code requires proper linking and including the correct headers.
|
||||||
|
VOLK_GNSSSDR currently supports both C and C++ bindings.
|
||||||
|
|
||||||
|
VOLK_GNSSSDR provides both a pkgconfig and a CMake module to help with
configuration and linking. The pkgconfig file is installed to
$install_prefix/lib/pkgconfig/volk_gnsssdr.pc. The CMake configuration module is in
$install_prefix/lib/cmake/volk_gnsssdr/VolkConfig.cmake.
|
||||||
|
|
||||||
|
The VOLK_GNSSSDR include directory (includedir in pkgconfig,
VOLK_GNSSSDR_INCLUDE_DIRS in the CMake module) contains the header
volk_gnsssdr/volk_gnsssdr.h, which defines all of the symbols exposed by
VOLK_GNSSSDR. Alternatively, individual kernel headers are in the same location.
|
||||||
|
|
||||||
|
In most cases it is sufficient to call the dispatcher for the kernel you are using.
|
||||||
|
|
||||||
|
*/
|
||||||
|
|
@ -1,6 +1,6 @@
|
|||||||
/*!
|
/*!
|
||||||
* \file volk_gnsssdr_32fc_convert_16ic.h
|
* \file volk_gnsssdr_16ic_convert_32fc.h
|
||||||
* \brief Volk protokernel: converts 16 bit integer complex complex values to 32 bits float complex values
|
* \brief Volk protokernel: converts 16 bit integer complex complex values to 32 bits float complex values
|
||||||
* \authors <ul>
|
* \authors <ul>
|
||||||
* <li> Javier Arribas, 2015. jarribas(at)cttc.es
|
* <li> Javier Arribas, 2015. jarribas(at)cttc.es
|
||||||
* </ul>
|
* </ul>
|
||||||
|