Carles Fernandez
684073bef6
adding NEON puppet
2016-04-01 01:47:51 +02:00
Carles Fernandez
43330588a8
adding include for neon intrinsics
2016-04-01 01:25:57 +02:00
Carles Fernandez
f6cfc64cf7
adding NEON protokernel
2016-04-01 01:21:35 +02:00
Carles Fernandez
a71b118170
Workaround for problems with orc
2016-03-31 21:35:05 +02:00
Carles Fernandez
817139ba50
ask for aligned memory in a more portable way
2016-03-31 19:39:37 +02:00
Carles Fernandez
36660e05ca
Removing unused class (nco_lib, replaced by volk kernels)
2016-03-30 22:37:43 +02:00
Carles Fernandez
4f3273f296
code cleaning, removing tabs
2016-03-30 22:27:12 +02:00
Carles Fernandez
9eb175fb0e
Adding new resampler kernel and integrating it in the multicorrelator
2016-03-30 21:33:43 +02:00
Carles Fernandez
d8b45d9b79
fix wrong storeu by store
2016-03-28 13:51:51 +02:00
Carles Fernandez
78372ba2e9
adding _mm256_zeroupper() at the end of AVX and AVX2 protokernels
...
This avoids penalties for state transitions from 256-bit x86-AVX
instructions to x86-SSE instructions
2016-03-28 11:58:01 +02:00
Carles Fernandez
b1d99d58ec
fix typo in puppet initialization
...
The AVX2 protokernel achieves an acceleration factor x13.
2016-03-28 10:14:32 +02:00
Carles Fernandez
26e68e89f2
adding AVX2 protokernels (aligned and unaligned)
2016-03-28 09:42:55 +02:00
Carles Fernandez
7c1f5723e6
remove unneeded stores in NEON protokernels
2016-03-27 13:00:04 +02:00
Carles Fernandez
d113835073
adding new kernel: volk_gnsssdr_32fc_x2_rotator_dot_prod_32fc_xn
...
Including generic, SSE3 (aligned and unaligned), AVX (aligned and
unaligned) and NEON protokernels.
2016-03-27 12:50:53 +02:00
Carles Fernandez
751764343c
adding AVX2 protokernels
...
I haven't found a way to do the rotator part better than with SSE3. Only
the dot product takes real advantage of 256-bit registers. Even so,
the gain with respect to SSE3 is about 12%.
2016-03-26 01:51:01 +01:00
Carles Fernandez
d987a04d42
adding AVX2 protokernels
2016-03-22 18:03:34 +01:00
Carles Fernandez
bd6c028ec4
bug fix
...
writing to the input pointer was having bad consequences (random failures
in other kernels)
2016-03-22 18:00:56 +01:00
Javier Arribas
0e47d97dec
Adding a missing include in gnsssdr volk kernel library (volk_gnsssdr_16ic_x2_rotator_dot_prod_16ic_xn)
2016-03-21 16:07:16 +01:00
Carles Fernandez
703de227a2
fix typo
2016-03-21 00:44:09 +01:00
Carles Fernandez
a908804b44
fix phase computation in the tail items of the NEON protokernel
2016-03-21 00:40:36 +01:00
Carles Fernandez
1983562496
The sincos kernel now accepts an initial phase
2016-03-21 00:38:08 +01:00
Carles Fernandez
485a405bab
Adding new neon kernel and solving x86 issues
...
Managing memory with volk_gnsssdr instead of malloc and free. This seems
to solve runtime problems (segmentation faults) in i386 (32 bit)
architectures.
2016-03-20 13:11:53 +01:00
Carles Fernandez
883cf629d1
Adding new NEON protokernel
...
Try another strategy based on multiply-and-accumulate for the dot
product. In all SIMD protokernels, managing memory with
volk_gnsssdr_malloc and volk_gnsssdr_free instead of calloc and free
2016-03-20 12:23:45 +01:00
Carles Fernandez
fa292961c1
Fix neon protokernel
2016-03-20 01:50:04 +01:00
Carles Fernandez
9c8fc9436e
Adding and integrating sincos kernel
2016-03-20 01:45:01 +01:00
Carles Fernandez
2be266cc71
adding sincos kernel
2016-03-19 21:41:19 +01:00
Carles Fernandez
c9ff9759cc
Fixing some numerical problems
2016-03-18 19:46:18 +01:00
Carles Fernandez
2cf1ea85af
fix app name in help message
2016-03-13 12:07:40 +01:00
Carles Fernandez
87dc56e147
Using vector multiply-accumulate in NEON kernels
...
Approx. 10% improvement
2016-03-12 21:47:35 +01:00
Carles Fernandez
f7c1c9ce43
Using multiply-accumulate in NEON
2016-03-12 19:30:00 +01:00
Carles Fernandez
c236c3ab67
Exploiting multiply-accumulate in NEON
2016-03-12 13:32:10 +01:00
Carles Fernandez
a93a01e9b1
prefetching data
2016-03-12 12:22:26 +01:00
Carles Fernandez
268f298fad
More elegant workaround for 32-bit architectures
2016-03-12 09:28:25 +01:00
Carles Fernandez
d9c333c85f
Improving documentation
...
Adding Doxygen documentation to VOLK_GNSSSDR kernels
2016-03-10 00:56:23 +01:00
Carles Fernandez
f6e713929a
Adding documentation
...
Copied from VOLK, with some minor changes
2016-03-09 21:01:22 +01:00
Carles Fernandez
81f4eadb5b
Deleting old, unused file
2016-03-09 19:35:56 +01:00
Carles Fernandez
243d66218b
Change @VERSION by @LIBVER@
2016-03-09 18:41:02 +01:00
Carles Fernandez
250375dbd3
Fix incorrect include when building with MSVC
...
Keeping track of VOLK's improvements, see
65539f2691
2016-03-09 18:23:19 +01:00
Carles Fernandez
9a92672905
Fix some CMake complaints
...
Keeping track of VOLK's improvements, see
434c994f21
2016-03-09 18:19:05 +01:00
Carles Fernandez
cf44382afe
tmpl: cast windows regs to int* calling cpuidex
...
Keeping track of VOLK's improvements. See
b1b69e1ae3
2016-03-09 18:15:28 +01:00
Carles Fernandez
1e9a9d1a55
reverting wrong commit
2016-03-09 15:56:07 +01:00
Carles Fernandez
59011a7772
prefetching data in the cache
2016-03-07 19:57:22 +01:00
Carles Fernandez
aac79eb78a
prefetching data in the cache
2016-03-07 19:25:12 +01:00
Carles Fernandez
a3d7683c85
Fix segmentation fault of volk_gnsssdr_profile in 32-bit architectures
...
Temporary deactivation of the unaligned protokernel of the multiple
correlator. It does not affect the receiver's performance. The commit
includes other minor fixes.
2016-03-07 18:35:40 +01:00
Carles Fernandez
b24db5d77e
Fix compilation with CMake 3.5
...
The CMake variables CMAKE_BINARY_DIR and CMAKE_SOURCE_DIR should never be set. Now CMake 3.5 prevents the user from doing that. They have been replaced by their counterparts PROJECT_BINARY_DIR and PROJECT_SOURCE_DIR
2016-02-25 15:26:32 +01:00
Carles Fernandez
9ae59c2009
Adding missing include
2016-02-22 10:07:08 +01:00
Carles Fernandez
11c84ed8ad
Fixing kernels
2016-02-19 11:03:24 +01:00
Carles Fernandez
b2a654c646
fix typos
2016-02-14 15:06:45 +01:00
Carles Fernandez
1930f02c4f
saving one register in neon implementation
2016-02-14 15:02:17 +01:00
Carles Fernandez
6156f4b3de
some small fixes
2016-02-14 14:52:26 +01:00
Carles Fernandez
e8dfd860fb
prefetching data in neon implementation
...
5% average improvement
2016-02-13 14:26:40 +01:00
Carles Fernandez
a4e2ceb9c4
Adding neon implementation
...
Input data have been re-scaled to avoid saturation problems
2016-02-13 14:16:40 +01:00
Javier Arribas
d4d73e24c1
Fixing some includes in volk gnsssdr kernels
2016-02-12 12:36:08 +01:00
Carles Fernandez
e400885800
Fixing puppets
...
In kernels whose output is shorter than num_points, memory is first
filled with zeros and then the kernel is executed.
2016-02-11 21:15:46 +01:00
Javier Arribas
7f9dccd386
generic implementation simplification in volk gnsssdr kernel module
2016-02-11 17:57:03 +01:00
Carles Fernandez
7d0e3126aa
Merge branch 'next' of git+ssh://github.com/gnss-sdr/gnss-sdr into next
2016-02-09 19:43:07 +01:00
Javier Arribas
9bf4710679
Added a new volk_gnsssdr kernel that integrates both the phase rotator
...
and n dot_product kernels. Enabled in cpu_multicorrelator_16sc
2016-02-09 11:49:18 +01:00
Carles Fernandez
794d141e84
Improved processor/feature detection when building with MSVC
2016-02-07 10:56:21 +01:00
Carles Fernandez
844c33d699
improving documentation
2016-01-31 23:21:28 +01:00
Carles Fernandez
bb54222883
improving documentation
2016-01-31 23:13:10 +01:00
Carles Fernandez
213486c2eb
improving documentation
2016-01-31 19:36:48 +01:00
Carles Fernandez
833fe313c7
Improving documentation
2016-01-31 18:13:03 +01:00
Carles Fernandez
f4875012df
prefetch data in the cache in neon implementation
...
8% average improvement
2016-01-31 10:41:51 +01:00
Carles Fernandez
8a6c4d767f
ask for aligned memory in neon implementation
...
1% improvement
2016-01-31 10:39:24 +01:00
Carles Fernandez
4fcffa2bdd
some improvements
...
phase computation was correctly done in SSE implementation but not in
NEON. Ask for aligned memory in NEON implementation. Some code cleaning
2016-01-31 09:49:50 +01:00
Carles Fernandez
db321d1c2e
Fixing missing phase increment in SIMD implementations
...
After computing the rotation with SIMD instructions, we were not
incrementing the phase step, so the first iteration in the 'c region'
had the same phase as the last sample computed with SIMD instructions.
This commit fixes the bug in SSE3 and NEON implementations
2016-01-29 19:42:30 +01:00
Carles Fernandez
8c07815852
fix missing time step in neon implementation
2016-01-29 19:30:31 +01:00
Javier Arribas
a26255270e
Optimized SSE3 16ic rotator volk_gnsssdr module
2016-01-29 18:43:44 +01:00
Carles Fernandez
ccbdcf8788
adding neon implementation
...
about x10 acceleration
2016-01-28 23:36:19 +01:00
Carles Fernandez
d69e8e34f6
adding neon implementation
2016-01-28 19:45:31 +01:00
Carles Fernandez
2014149e17
Merge branch 'next' of git+ssh://github.com/gnss-sdr/gnss-sdr into next
2016-01-28 18:10:21 +01:00
Javier Arribas
d2898c40ce
Added SSE2 implementation for volk_gnss-sdr 16ic phase rotator. Bug fix
...
in volk_gnss-sdr rotator puppet unit test.
2016-01-28 16:42:19 +01:00
Carles Fernandez
4e12f6ee5a
adding definition of new volk_gnsssdr kernel: 16-bit complex rotator
...
generic implementation only
2016-01-27 18:34:20 +01:00
Carles Fernandez
1d9fc3ceae
adding neon implementation
2016-01-25 20:53:02 +01:00
Carles Fernandez
ba8f0e86b2
adding neon implementation
2016-01-25 18:13:54 +01:00
Carles Fernandez
3306c21cf8
adding neon implementation
2016-01-24 20:10:12 +01:00
Carles Fernandez
da67f85f6c
remove unused variable in neon implementation
2016-01-24 14:38:34 +01:00
Carles Fernandez
b18fc5835c
fix implementation
2016-01-24 14:37:19 +01:00
Carles Fernandez
377acfc322
add neon implementation
2016-01-24 14:30:33 +01:00
Carles Fernandez
2d21706041
add neon implementation
2016-01-24 13:02:02 +01:00
Carles Fernandez
812a4df93f
add neon implementation
2016-01-24 12:01:40 +01:00
Carles Fernandez
cd2f0b86f6
add neon implementation
2016-01-23 21:22:30 +01:00
Carles Fernandez
c5252da7fd
adding neon implementation
2016-01-23 21:05:28 +01:00
Carles Fernandez
a49cf3a98f
missing include
2016-01-22 16:52:43 +01:00
Carles Fernandez
7bf4bfd7dc
adding neon implementation
2016-01-22 12:29:08 +01:00
Carles Fernandez
642018bada
tagging version
2016-01-22 10:14:43 +01:00
Carles Fernandez
8159c1ec22
Adding README and .gitignore
2016-01-22 02:02:23 +01:00
Carles Fernandez
61e36aa2a0
copy GPLv3 license from gnss-sdr
2016-01-22 00:54:52 +01:00
Carles Fernandez
4553b5e643
cleaning includes
2016-01-21 23:30:24 +01:00
Carles Fernandez
d139e6d93a
Using limits.h instead of hardcoded values
2016-01-21 12:30:46 +01:00
Javier Arribas
62a17dc3d7
Replaced literal limits with values stored in limits.h for volk gnss-sdr
...
kernel
2016-01-21 11:30:09 +01:00
Javier Arribas
02a6f41794
Fix seg fault on some architectures in gnss-sdr volk 32fc convert to
...
16ic module
2016-01-21 11:21:25 +01:00
Carles Fernandez
3ce1bba194
Fix execution of puppets when compiled with clang
2016-01-21 01:40:29 +01:00
Carles Fernandez
577f7f1940
fixes CMake warning under Linux
2016-01-21 00:42:17 +01:00
Carles Fernandez
f6cb32bc9f
cleaning
2016-01-21 00:25:53 +01:00
Carles Fernandez
88752588b6
remove duplicated copyright text
2016-01-21 00:12:14 +01:00
Carles Fernandez
53179468bd
Removing unused constant
2016-01-20 23:42:25 +01:00
Carles Fernandez
fe4cce043d
Remove unnecessary code, making it closer to the original VOLK
...
GNSS-SDR-specific additions are clearly marked, so it will be easier to
follow their changes and to add other specific features
2016-01-20 20:14:42 +01:00
Carles Fernandez
497c856437
add unaligned version
2016-01-20 18:38:33 +01:00
Carles Fernandez
c7193e394e
Merge branch 'new_volk_module' of git+ssh://github.com/gnss-sdr/gnss-sdr
...
into new_volk_module
2016-01-20 18:18:32 +01:00
Carles Fernandez
9bf8b174ba
Sort out the aligned/unaligned thing in old kernels
2016-01-20 18:16:09 +01:00
Javier Arribas
07feeeee3a
New volk_gnss_sdr kernel: Fast conversion from 16-bit int complex to
...
32-bit floating point complex
2016-01-20 17:45:47 +01:00
Javier Arribas
e92f409897
Added SSE2 unaligned versions of volk_gnss-sdr dot product and resampler
...
kernels.
2016-01-20 15:53:09 +01:00
Carles Fernandez
7a5574f598
Fixing aligned/unaligned tag
2016-01-20 11:21:58 +01:00
Carles Fernandez
0215748638
Adding a puppet for the multiple correlator
2016-01-19 12:42:55 +01:00
Carles Fernandez
bbe0f37910
fixing result reading in puppet
2016-01-19 11:53:46 +01:00
Carles Fernandez
bf0a37960f
adding a puppet for the multiple resampler
2016-01-19 10:45:56 +01:00
Carles Fernandez
090f6524db
Merge branch 'new_volk_module' of git+ssh://github.com/gnss-sdr/gnss-sdr
...
into new_volk_module
# Conflicts:
# src/algorithms/libs/volk_gnsssdr_module/volk_gnsssdr/kernels/volk_gnsssdr/volk_gnsssdr_16ic_x2_dot_prod_16ic.h
2016-01-19 00:01:26 +01:00
Carles Fernandez
1d18ff6c16
avoiding redefinition of functions
2016-01-18 23:50:34 +01:00
Carles Fernandez
57c05e3cf0
adding a puppet for the 16-bit complex resampler
2016-01-18 23:44:10 +01:00
Carles Fernandez
e53e85f41b
fixing a wrong fix :-P
2016-01-18 20:59:51 +01:00
Javier Arribas
4ba75a3fbe
Still some bugs to fix in 16sc dot product. All fixed now.
2016-01-18 10:43:09 +01:00
Carles Fernandez
fd2af02aec
fix sse implementation
2016-01-16 23:15:19 +01:00
Carles Fernandez
a2429a851c
fix sse implementation
2016-01-16 22:52:10 +01:00
Carles Fernandez
cd80beb16c
fix sse implementation
2016-01-16 22:49:34 +01:00
Carles Fernandez
3d3a758ef2
fix sse implementation
2016-01-16 22:48:29 +01:00
Carles Fernandez
46e3ce5ec2
fix sse implementations
2016-01-16 22:39:15 +01:00
Carles Fernandez
38d4d8aa9a
fix sse implementations
2016-01-16 20:57:55 +01:00
Carles Fernandez
a817d49e89
fix that makes the test pass
2016-01-16 14:29:15 +01:00
Carles Fernandez
dab4da064c
Updating documentation
2016-01-16 14:11:12 +01:00
Carles Fernandez
3eab3b58c6
Removing garbage
2016-01-15 18:25:05 +01:00
Javier Arribas
fb42cda826
Range reduced to 4 bits in the volk short int test input to avoid
...
saturation of vector dot products.
Reduced test vector sizes to 8111 to avoid saturation.
2016-01-14 18:56:22 +01:00
Carles Fernandez
5fdbb472f6
required by memset
2016-01-13 20:09:27 +01:00
Carles Fernandez
e57d02321d
Merge branch 'new_volk_module' of git+ssh://github.com/gnss-sdr/gnss-sdr
...
into new_volk_module
# Conflicts:
# src/algorithms/libs/volk_gnsssdr_module/volk_gnsssdr/kernels/volk_gnsssdr/volk_gnsssdr_16ic_x2_dot_prod_16ic_xn.h
# src/algorithms/tracking/libs/volk_gnsssdr_16ic_xn_resampler_16ic_xn.h
2016-01-13 20:04:18 +01:00
Carles Fernandez
ae2b594c3b
Moving two kernels to volk_gnsssdr. Still no testing
2016-01-13 19:38:07 +01:00
Javier Arribas
5d0186eee1
Renamed saturated arithmetic library and some code cleaning and
...
refresh documentation in new gnsssdr volk modules
2016-01-13 15:37:58 +01:00
Carles Fernandez
f88f222ef6
fixes cmake warning in MacOSX
2016-01-13 15:34:50 +01:00
Carles Fernandez
ece7bc2c65
Integrating a new volk kernel
2016-01-13 11:42:01 +01:00
Carles Fernandez
f659005e63
Remove duplicated line
2016-01-13 11:19:07 +01:00
Carles Fernandez
bbdf52dbed
fixes for MacOS
2016-01-13 11:04:32 +01:00
Carles Fernandez
97ed762964
some fixes
2016-01-13 10:22:06 +01:00
Carles Fernandez
62f23a7a2b
Fix warning (unused variable)
2016-01-12 23:22:50 +01:00
Carles Fernandez
551735e034
Fixes warning about posix_memalign
2016-01-12 23:19:09 +01:00
Carles Fernandez
48e9ada2e1
fix warning
2016-01-12 23:12:04 +01:00
Carles Fernandez
eb1dbfe37b
introducing new kernels
2016-01-12 22:48:59 +01:00
Carles Fernandez
1bf645c98c
adding a missing kernel
2016-01-12 21:03:51 +01:00
Carles Fernandez
24909510e7
Updating volk_gnsssdr to the new volk scheme
2016-01-12 20:15:16 +01:00
Carles Fernandez
f4584a12c1
Fixing a couple of warnings
...
unused parameter '_chip_shift', comparison between signed and unsigned
integers. Regular casts have been replaced by static casts.
2016-01-11 11:32:41 +01:00
Carles Fernandez
2697fb6198
Cleaning includes
2016-01-10 22:21:31 +01:00
Carles Fernandez
9e0c1bb719
remove unneeded headers
2016-01-03 15:22:52 +01:00
Carles Fernandez
17517625be
fixing includes
2015-12-01 13:30:03 +01:00
Carles Fernandez
49d974db77
Avoids redefinition of constants
2015-09-25 23:51:42 +02:00
Carles Fernandez
b665444550
Fix linking against
...
18ed0f15bc
2015-09-18 17:40:47 +02:00
Carles Fernandez
a84b4baef0
Removing cudahelpers library and usage due to a copyright issue. It does not
...
affect functionality.
2015-09-10 17:46:38 +02:00
Carles Fernandez
e0669ba93d
Fixes warning about posix_memalign
2015-09-05 13:05:53 +02:00
Carles Fernandez
6febea48fa
bumping version number
2015-09-02 00:38:46 +02:00
Carles Fernandez
0821216970
Moving cudahelpers headers so other blocks can use them more easily.
2015-08-25 20:49:37 +02:00
Carles Fernandez
a8bc6e7cc7
fixing coverity scan parse warnings
2015-07-12 14:14:11 +02:00
Javier
26a6bbd37a
bug found in PRN resampler code. Disabled optimization
2015-06-12 19:28:56 +02:00
Carles Fernandez
4c0243580b
fixing incorrect expression
2015-06-01 19:27:58 +02:00