mirror of https://github.com/gnss-sdr/gnss-sdr synced 2025-01-31 03:14:56 +00:00
Commit Graph

290 Commits

Author SHA1 Message Date
Carles Fernandez
57107cf86d Introducing a new resampler kernel for comparison 2016-04-01 12:41:00 +02:00
Carles Fernandez
7658f64527 adding unaligned protokernels 2016-04-01 10:36:52 +02:00
Carles Fernandez
5841258a36 bug fix 2016-04-01 01:51:58 +02:00
Carles Fernandez
99ceb30a0e Fixes related to the MSVC compiler 2016-04-01 01:50:09 +02:00
Carles Fernandez
684073bef6 adding NEON puppet 2016-04-01 01:47:51 +02:00
Carles Fernandez
43330588a8 adding include for neon intrinsics 2016-04-01 01:25:57 +02:00
Carles Fernandez
f6cfc64cf7 adding NEON protokernel 2016-04-01 01:21:35 +02:00
Carles Fernandez
a71b118170 Workaround for problems with orc 2016-03-31 21:35:05 +02:00
Carles Fernandez
817139ba50 ask for aligned memory in a more portable way 2016-03-31 19:39:37 +02:00
Carles Fernandez
36660e05ca Removing unused class (nco_lib, replaced by volk kernels) 2016-03-30 22:37:43 +02:00
Carles Fernandez
4f3273f296 code cleaning, removing tabs 2016-03-30 22:27:12 +02:00
Carles Fernandez
9eb175fb0e Adding new resampler kernel and integrating it in the multicorrelator 2016-03-30 21:33:43 +02:00
Carles Fernandez
d8b45d9b79 fix wrong storeu by store 2016-03-28 13:51:51 +02:00
Carles Fernandez
78372ba2e9 adding _mm256_zeroupper() at the end of AVX and AVX2 protokernels
This avoids penalties for state transitions from 256-bit x86-AVX
instructions to x86-SSE instructions
2016-03-28 11:58:01 +02:00
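The transition penalty mentioned in the commit above can be illustrated with a minimal sketch. It is not code from the repository: the function name add_floats_sketch is hypothetical, and the AVX path is compiled only when the compiler defines __AVX__ (for example with -mavx), falling back to plain C otherwise.

```c
#include <stddef.h>
#if defined(__AVX__)
#include <immintrin.h>
#endif

/* Hypothetical example (not from gnss-sdr): element-wise sum of two float
   arrays. The AVX path handles 8 floats per iteration with unaligned
   loads and stores. */
static void add_floats_sketch(float *out, const float *a, const float *b, size_t n)
{
    size_t i = 0;
#if defined(__AVX__)
    for (; i + 8 <= n; i += 8)
        {
            __m256 va = _mm256_loadu_ps(a + i);
            __m256 vb = _mm256_loadu_ps(b + i);
            _mm256_storeu_ps(out + i, _mm256_add_ps(va, vb));
        }
    /* Clear the upper halves of the YMM registers so that SSE code executed
       afterwards does not pay the AVX-to-SSE state transition penalty. */
    _mm256_zeroupper();
#endif
    /* Scalar tail (or the whole array when AVX is not available). */
    for (; i < n; i++)
        {
            out[i] = a[i] + b[i];
        }
}
```

Calling add_floats_sketch(out, a, b, n) gives the same result on both paths; only the cost of any SSE code that runs afterwards differs.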
Carles Fernandez
b1d99d58ec fix typo in puppet initialization
The AVX2 protokernel achieves an acceleration factor x13.
2016-03-28 10:14:32 +02:00
Carles Fernandez
26e68e89f2 adding AVX2 protokernels (aligned and unaligned) 2016-03-28 09:42:55 +02:00
Carles Fernandez
7c1f5723e6 remove unneeded stores in NEON protokernels 2016-03-27 13:00:04 +02:00
Carles Fernandez
d113835073 adding new kernel: volk_gnsssdr_32fc_x2_rotator_dot_prod_32fc_xn
Including generic, SSE3 (aligned and unaligned), AVX (aligned and
unaligned) and NEON protokernels.
2016-03-27 12:50:53 +02:00
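As a rough illustration of what a rotator followed by several dot products computes, here is a plain C sketch. It is an assumption about the general technique, not the library's generic protokernel: the name rotator_dot_prod_sketch and its parameter order are hypothetical, and a real implementation would probably also renormalize the phase from time to time.

```c
#include <complex.h>

/* Hypothetical sketch (not the volk_gnsssdr implementation): rotate the
   input by a phase that advances by phase_inc on every sample, then
   accumulate the dot product of the rotated signal with each of the
   n_vec local replicas. */
static void rotator_dot_prod_sketch(float complex *result,
    const float complex *in,
    float complex phase_inc,
    float complex *phase,
    const float complex *const *replicas,
    int n_vec,
    unsigned int n_points)
{
    for (int v = 0; v < n_vec; v++)
        {
            result[v] = 0.0f;
        }
    for (unsigned int i = 0; i < n_points; i++)
        {
            /* Rotator: apply the current phase, then advance it. */
            float complex rotated = in[i] * (*phase);
            *phase *= phase_inc;
            /* Dot products: one accumulator per replica. */
            for (int v = 0; v < n_vec; v++)
                {
                    result[v] += rotated * replicas[v][i];
                }
        }
}
```

The rotation is computed once per sample and reused for every replica, which is the point of fusing both steps in a single kernel. The caller passes the initial phase through the phase pointer and gets back the phase to use for the next block.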
Carles Fernandez
751764343c adding AVX2 protokernels
I haven't found a way to do the rotator part better than with SSE3. Only
the dot product takes real advantage of 256-bit registers. Even so,
the gain with respect to SSE3 is about 12%.
2016-03-26 01:51:01 +01:00
Carles Fernandez
d987a04d42 adding AVX2 protokernels 2016-03-22 18:03:34 +01:00
Carles Fernandez
bd6c028ec4 bug fix
writing to the input pointer was having bad consequences (random fails
in other kernels)
2016-03-22 18:00:56 +01:00
Javier Arribas
0e47d97dec Adding a missing include in gnsssdr volk kernel library (volk_gnsssdr_16ic_x2_rotator_dot_prod_16ic_xn) 2016-03-21 16:07:16 +01:00
Carles Fernandez
703de227a2 fix typo 2016-03-21 00:44:09 +01:00
Carles Fernandez
a908804b44 fix phase computation in the tail items of the NEON protokernel 2016-03-21 00:40:36 +01:00
Carles Fernandez
1983562496 The sincos kernel now accepts an initial phase 2016-03-21 00:38:08 +01:00
Carles Fernandez
485a405bab Adding new neon kernel and solving x86 issues
Managing memory with volk_gnsssdr instead of malloc and free. This seems
to solve runtime problems (segmentation faults) in i386 (32 bit)
architectures.
2016-03-20 13:11:53 +01:00
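The commit above does not say why the library allocator helped. One plausible reason, stated here as an assumption, is that aligned SIMD loads and stores need buffers aligned to the vector width, which plain malloc does not guarantee on every platform. The sketch below shows one portable way to obtain such memory; the names aligned_malloc_sketch and aligned_free_sketch are hypothetical and this is not how volk_gnsssdr_malloc is implemented.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: over-allocate with malloc, round the address up to
   the requested alignment, and store the original pointer just before the
   aligned block so that it can be released later. */
static void *aligned_malloc_sketch(size_t size, size_t alignment)
{
    /* The alignment must be a non-zero power of two. */
    if (alignment == 0 || (alignment & (alignment - 1)) != 0)
        {
            return NULL;
        }
    /* Guard against overflow of the padded size. */
    if (size > SIZE_MAX - alignment - sizeof(void *))
        {
            return NULL;
        }
    void *raw = malloc(size + alignment - 1 + sizeof(void *));
    if (raw == NULL)
        {
            return NULL;
        }
    uintptr_t start = (uintptr_t)raw + sizeof(void *);
    uintptr_t aligned = (start + alignment - 1) & ~(uintptr_t)(alignment - 1);
    ((void **)aligned)[-1] = raw; /* remember what malloc returned */
    return (void *)aligned;
}

/* Hypothetical helper: release a block obtained from aligned_malloc_sketch. */
static void aligned_free_sketch(void *ptr)
{
    if (ptr != NULL)
        {
            free(((void **)ptr)[-1]);
        }
}
```

A buffer obtained with aligned_malloc_sketch(n * sizeof(float), 32) could then be passed to an aligned AVX protokernel, and must be released with aligned_free_sketch rather than free.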
Carles Fernandez
883cf629d1 Adding new NEON protokernel
Try another strategy based on multiply-and-accumulate for the dot
product. In all SIMD protokernels, managing memory with
volk_gnsssdr_malloc and volk_gnsssdr_free instead of calloc and free
2016-03-20 12:23:45 +01:00
Carles Fernandez
fa292961c1 Fix neon protokernel 2016-03-20 01:50:04 +01:00
Carles Fernandez
9c8fc9436e Adding and integrating sincos kernel 2016-03-20 01:45:01 +01:00
Carles Fernandez
2be266cc71 adding sincos kernel 2016-03-19 21:41:19 +01:00
Carles Fernandez
c9ff9759cc Fixing some numerical problems 2016-03-18 19:46:18 +01:00
Carles Fernandez
2cf1ea85af fix app name in help message 2016-03-13 12:07:40 +01:00
Carles Fernandez
87dc56e147 Using vector multiply-accumulate in NEON kernels
Approx. 10% improvement
2016-03-12 21:47:35 +01:00
Carles Fernandez
f7c1c9ce43 Using multiply-accumulate in NEON 2016-03-12 19:30:00 +01:00
Carles Fernandez
c236c3ab67 Exploiting multiply-accumulate in NEON 2016-03-12 13:32:10 +01:00
Carles Fernandez
a93a01e9b1 prefetching data 2016-03-12 12:22:26 +01:00
Carles Fernandez
268f298fad More elegant workaround for 32-bit architectures 2016-03-12 09:28:25 +01:00
Carles Fernandez
d9c333c85f Improving documentation
Adding Doxygen documentation to VOLK_GNSSSDR kernels
2016-03-10 00:56:23 +01:00
Carles Fernandez
f6e713929a Adding documentation
Copied from VOLK, with some minor changes
2016-03-09 21:01:22 +01:00
Carles Fernandez
81f4eadb5b Deleting old, unused file 2016-03-09 19:35:56 +01:00
Carles Fernandez
243d66218b Change @VERSION by @LIBVER@ 2016-03-09 18:41:02 +01:00
Carles Fernandez
250375dbd3 Fix incorrect include when building with MSVC
Keeping track of VOLK's improvements, see
65539f2691
2016-03-09 18:23:19 +01:00
Carles Fernandez
9a92672905 Fix some CMake complaints
Keeping track of VOLK's improvements, see
434c994f21
2016-03-09 18:19:05 +01:00
Carles Fernandez
cf44382afe tmpl: cast windows regs to int* calling cpuidex
Keeping track of VOLK's improvements. See
b1b69e1ae3
2016-03-09 18:15:28 +01:00
Carles Fernandez
1e9a9d1a55 reverting wrong commit 2016-03-09 15:56:07 +01:00
Carles Fernandez
59011a7772 prefetching data in the cache 2016-03-07 19:57:22 +01:00
Carles Fernandez
aac79eb78a prefetching data in the cache 2016-03-07 19:25:12 +01:00
Carles Fernandez
a3d7683c85 Fix segmentation fault of volk_gnsssdr_profile in 32-bit architectures
Temporary deactivation of the unaligned protokernel of the multiple
correlator. It does not affect the receiver's performance. The commit
includes other minor fixes.
2016-03-07 18:35:40 +01:00
Carles Fernandez
b24db5d77e Fix compilation with CMake 3.5
The CMake variables CMAKE_BINARY_DIR and CMAKE_SOURCE_DIR should never be set by the project, and CMake 3.5 now prevents doing so. They have been replaced by their counterparts PROJECT_BINARY_DIR and PROJECT_SOURCE_DIR.
2016-02-25 15:26:32 +01:00
Carles Fernandez
9ae59c2009 Adding missing include 2016-02-22 10:07:08 +01:00