Carles Fernandez
57107cf86d
Introducing a new resampler kernel for comparison
2016-04-01 12:41:00 +02:00
Carles Fernandez
7658f64527
adding unaligned protokernels
2016-04-01 10:36:52 +02:00
Carles Fernandez
5841258a36
bug fix
2016-04-01 01:51:58 +02:00
Carles Fernandez
99ceb30a0e
Fixes related to the MSVC compiler
2016-04-01 01:50:09 +02:00
Carles Fernandez
684073bef6
adding NEON puppet
2016-04-01 01:47:51 +02:00
Carles Fernandez
43330588a8
adding include for neon intrinsics
2016-04-01 01:25:57 +02:00
Carles Fernandez
f6cfc64cf7
adding NEON protokernel
2016-04-01 01:21:35 +02:00
Carles Fernandez
a71b118170
Workaround for problems with orc
2016-03-31 21:35:05 +02:00
Carles Fernandez
817139ba50
ask for aligned memory in a more portable way
2016-03-31 19:39:37 +02:00
Carles Fernandez
36660e05ca
Removing unused class (nco_lib, replaced by volk kernels)
2016-03-30 22:37:43 +02:00
Carles Fernandez
4f3273f296
code cleaning, removing tabs
2016-03-30 22:27:12 +02:00
Carles Fernandez
9eb175fb0e
Adding new resampler kernel and integrating it in the multicorrelator
2016-03-30 21:33:43 +02:00
Carles Fernandez
d8b45d9b79
fix wrong storeu by store
2016-03-28 13:51:51 +02:00
Carles Fernandez
78372ba2e9
adding _mm256_zeroupper() at the end of AVX and AVX2 protokernels
...
This avoids penalties for state transitions from 256-bit x86-AVX
instructions to x86-SSE instructions
2016-03-28 11:58:01 +02:00
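The transition penalty mentioned in the commit above can be illustrated with a minimal sketch. The helper name `sum_floats` is hypothetical and not part of volk_gnsssdr; the AVX path is compiled only when the compiler defines `__AVX__`, and a scalar loop handles the remaining items (or everything, on builds without AVX).

```c
#include <stddef.h>
#if defined(__AVX__)
#include <immintrin.h>
#endif

/* Hypothetical helper, for illustration only: sums n floats. */
float sum_floats(const float* in, size_t n)
{
    float acc = 0.0f;
    size_t i = 0;
#if defined(__AVX__)
    __m256 vacc = _mm256_setzero_ps();
    /* Process 8 floats per iteration using unaligned loads */
    for (; i + 8 <= n; i += 8)
        {
            vacc = _mm256_add_ps(vacc, _mm256_loadu_ps(in + i));
        }
    float lanes[8];
    _mm256_storeu_ps(lanes, vacc);
    for (int k = 0; k < 8; k++)
        {
            acc += lanes[k];
        }
    /* Zero the upper halves of the YMM registers so that SSE code
       executed afterwards does not pay the AVX-to-SSE transition cost */
    _mm256_zeroupper();
#endif
    /* Scalar tail (or the whole input when AVX is not available) */
    for (; i < n; i++)
        {
            acc += in[i];
        }
    return acc;
}
```

The call to `_mm256_zeroupper()` is placed after the last 256-bit instruction and before returning, which matches the placement described in the commit message ("at the end of AVX and AVX2 protokernels").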
Carles Fernandez
b1d99d58ec
fix typo in puppet initialization
...
The AVX2 protokernel achieves an acceleration factor of 13x.
2016-03-28 10:14:32 +02:00
Carles Fernandez
26e68e89f2
adding AVX2 protokernels (aligned and unaligned)
2016-03-28 09:42:55 +02:00
Carles Fernandez
7c1f5723e6
remove unneeded stores in NEON protokernels
2016-03-27 13:00:04 +02:00
Carles Fernandez
d113835073
adding new kernel: volk_gnsssdr_32fc_x2_rotator_dot_prod_32fc_xn
...
Including generic, SSE3 (aligned and unaligned), AVX (aligned and
unaligned) and NEON protokernels.
2016-03-27 12:50:53 +02:00
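As a rough illustration of what a rotator-plus-dot-product kernel computes, here is a minimal generic sketch in plain C99. The function name `rotator_dot_prod_sketch`, its parameter names and its signature are assumptions made for this example, not the library's actual interface. It rotates a common input by a running phase and accumulates its product with each of several local vectors.

```c
#include <complex.h>
#include <stddef.h>

/* Hypothetical generic sketch, not the volk_gnsssdr implementation.
 * result[v] = sum over n of (in_common[n] * phase_n) * in_a[v][n],
 * where phase_n starts at *phase and is multiplied by phase_inc
 * after every sample. On return, *phase holds the updated phase. */
void rotator_dot_prod_sketch(float complex* result,
    const float complex* in_common,
    float complex phase_inc,
    float complex* phase,
    const float complex* const* in_a,
    size_t num_a_vectors,
    size_t num_points)
{
    for (size_t v = 0; v < num_a_vectors; v++)
        {
            result[v] = 0.0f;
        }
    for (size_t n = 0; n < num_points; n++)
        {
            /* Rotate the common input once, then reuse it for all vectors */
            float complex rotated = in_common[n] * (*phase);
            *phase *= phase_inc;
            for (size_t v = 0; v < num_a_vectors; v++)
                {
                    result[v] += rotated * in_a[v][n];
                }
        }
}
```

Rotating each input sample once and reusing it across all local vectors is the reason for fusing both operations in a single pass; the SIMD protokernels listed in the commit would vectorize these loops.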
Carles Fernandez
751764343c
adding AVX2 protokernels
...
I haven't found a way to do the rotator part better than with SSE3. Only
the dot product takes real advantage of 256-bit registers. Even so,
the gain with respect to SSE3 is about 12%.
2016-03-26 01:51:01 +01:00
Carles Fernandez
d987a04d42
adding AVX2 protokernels
2016-03-22 18:03:34 +01:00
Carles Fernandez
bd6c028ec4
bug fix
...
Writing to the input pointer had bad consequences (random failures
in other kernels).
2016-03-22 18:00:56 +01:00
Javier Arribas
0e47d97dec
Adding a missing include in gnsssdr volk kernel library (volk_gnsssdr_16ic_x2_rotator_dot_prod_16ic_xn)
2016-03-21 16:07:16 +01:00
Carles Fernandez
703de227a2
fix typo
2016-03-21 00:44:09 +01:00
Carles Fernandez
a908804b44
fix phase computation in the tail items of the NEON protokernel
2016-03-21 00:40:36 +01:00
Carles Fernandez
1983562496
The sincos kernel now accepts an initial phase
2016-03-21 00:38:08 +01:00
Carles Fernandez
485a405bab
Adding new neon kernel and solving x86 issues
...
Managing memory with volk_gnsssdr instead of malloc and free. This seems
to solve runtime problems (segmentation faults) on i386 (32-bit)
architectures.
2016-03-20 13:11:53 +01:00
Carles Fernandez
883cf629d1
Adding new NEON protokernel
...
Try another strategy based on multiply-and-accumulate for the dot
product. In all SIMD protokernels, managing memory with
volk_gnsssdr_malloc and volk_gnsssdr_free instead of calloc and free
2016-03-20 12:23:45 +01:00
Carles Fernandez
fa292961c1
Fix neon protokernel
2016-03-20 01:50:04 +01:00
Carles Fernandez
9c8fc9436e
Adding and integrating sincos kernel
2016-03-20 01:45:01 +01:00
Carles Fernandez
2be266cc71
adding sincos kernel
2016-03-19 21:41:19 +01:00
Carles Fernandez
c9ff9759cc
Fixing some numerical problems
2016-03-18 19:46:18 +01:00
Carles Fernandez
2cf1ea85af
fix app name in help message
2016-03-13 12:07:40 +01:00
Carles Fernandez
87dc56e147
Using vector multiply-accumulate in NEON kernels
...
Approx. 10% improvement
2016-03-12 21:47:35 +01:00
Carles Fernandez
f7c1c9ce43
Using multiply-accumulate in NEON
2016-03-12 19:30:00 +01:00
Carles Fernandez
c236c3ab67
Exploiting multiply-accumulate in NEON
2016-03-12 13:32:10 +01:00
Carles Fernandez
a93a01e9b1
prefetching data
2016-03-12 12:22:26 +01:00
Carles Fernandez
268f298fad
More elegant workaround for 32-bit architectures
2016-03-12 09:28:25 +01:00
Carles Fernandez
d9c333c85f
Improving documentation
...
Adding Doxygen documentation to VOLK_GNSSSDR kernels
2016-03-10 00:56:23 +01:00
Carles Fernandez
f6e713929a
Adding documentation
...
Copied from VOLK, with some minor changes
2016-03-09 21:01:22 +01:00
Carles Fernandez
81f4eadb5b
Deleting old, unused file
2016-03-09 19:35:56 +01:00
Carles Fernandez
243d66218b
Replace @VERSION with @LIBVER@
2016-03-09 18:41:02 +01:00
Carles Fernandez
250375dbd3
Fix incorrect include when building with MSVC
...
Keeping track of VOLK's improvements, see
65539f2691
2016-03-09 18:23:19 +01:00
Carles Fernandez
9a92672905
Fix some CMake complaints
...
Keeping track of VOLK's improvements, see
434c994f21
2016-03-09 18:19:05 +01:00
Carles Fernandez
cf44382afe
tmpl: cast windows regs to int* calling cpuidex
...
Keeping track of VOLK's improvements. See
b1b69e1ae3
2016-03-09 18:15:28 +01:00
Carles Fernandez
1e9a9d1a55
reverting wrong commit
2016-03-09 15:56:07 +01:00
Carles Fernandez
59011a7772
prefetching data in the cache
2016-03-07 19:57:22 +01:00
Carles Fernandez
aac79eb78a
prefetching data in the cache
2016-03-07 19:25:12 +01:00
Carles Fernandez
a3d7683c85
Fix segmentation fault of volk_gnsssdr_profile in 32-bit architectures
...
Temporary deactivation of the unaligned protokernel of the multiple
correlator. It does not affect the receiver's performance. The commit
includes other minor fixes.
2016-03-07 18:35:40 +01:00
Carles Fernandez
b24db5d77e
Fix compilation with CMake 3.5
...
The CMake variables CMAKE_BINARY_DIR and CMAKE_SOURCE_DIR should never be set, and CMake 3.5 now prevents the user from doing so. They have been replaced by their counterparts PROJECT_BINARY_DIR and PROJECT_SOURCE_DIR.
2016-02-25 15:26:32 +01:00
Carles Fernandez
9ae59c2009
Adding missing include
2016-02-22 10:07:08 +01:00