Fix various linting errors reported by both cpplint and
clang-format. For the latter, clang-format-19 was used
to quickly fix them all.
Signed-off-by: Marcus Alagar <mvala079@gmail.com>
Implement 32-bit floating-point complex number vector generator
using a fixed phase increment on each element utilizing
RVV C intrinsics.
Signed-off-by: Marcus Alagar <mvala079@gmail.com>
Implement experimental 32-bit floating-point number generation
of complex numbers using sin and cos utilizing RVV C intrinsics.
To be fully honest, this was simply reverse-engineered from the NEON
implementation, and I am not entirely sure how the final two big
calculations ("extended precision modular arithmetic" and calculating
both polynomials) were derived. As such, there may be a method that
makes better use of the capabilities of RVV, and any change that takes
advantage of them is welcome.
Signed-off-by: Marcus Alagar <mvala079@gmail.com>
Now, actually implement 32fc x2 rotator dot product and its associated
puppet utilizing RVV C intrinsics.
Signed-off-by: Marcus Alagar <mvala079@gmail.com>
Implement 32-bit floating-point complex number rotator and dot product
with input scalar 32-bit floating-point complex vectors utilizing RVV
C intrinsics.
Signed-off-by: Marcus Alagar <mvala079@gmail.com>
In the process, attempt an experimental optimization.
Specifically, at the point where the dot product is performed, load
the accumulators in a vectorized way so that the code can simply rotate
through the accumulators instead of reloading them one at a time.
However, this ended up making the test slower. This
may be a symptom of the relatively small test, so
it may be worth keeping an eye on in case use cases
turn out to mainly involve bigger vectors.
Signed-off-by: Marcus Alagar <mvala079@gmail.com>
Implement rotator of 16-bit integer values using
RVV intrinsics, along with its corresponding puppet
to enable testing.
Signed-off-by: Marcus Alagar <mvala079@gmail.com>
Replace the wordy pseudo-intrinsic `vset` with the
more concise (at least in these use cases)
`vcreate`.
Signed-off-by: Marcus Alagar <mvala079@gmail.com>
Implement high dynamics resampler, which seems to include a second-order
term in calculating the index as well as using a simple for loop
after calculating the first resample that just `memcpy`s at
an offset for the adjacent correlators.
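As a rough scalar illustration of the second-order index term described
above (not the RVV kernel itself), the first-correlator index could be
computed along these lines. All names here (`high_dynamics_index`,
`step`, `rate_step`, `shift`, `rem`) are hypothetical, and the exact
formula in the real kernel may differ:

```c
#include <assert.h>
#include <stddef.h>

/* Floor without libm: truncate, then step down for negative fractions. */
static int floor_to_int(float x)
{
    int i = (int)x;
    return ((float)i > x) ? i - 1 : i;
}

/* Hypothetical sketch: index of the local-code chip for output sample n.
 * The rate_step * n^2 term is the second-order ("high dynamics") part. */
static int high_dynamics_index(size_t n, float step, float rate_step,
                               float shift, float rem, int code_length)
{
    assert(code_length > 0);
    const float fn = (float)n;
    const float pos = step * fn + rate_step * fn * fn + shift - rem;
    int idx = floor_to_int(pos) % code_length;
    if (idx < 0) {
        idx += code_length; /* C's % keeps the sign of the dividend */
    }
    return idx;
}
```

With `rate_step` set to zero this reduces to the ordinary first-order
resampler index.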
Signed-off-by: Marcus Alagar <mvala079@gmail.com>
Implement fast 16-bit integer complex number xn resampler along
with its corresponding puppet. Essentially just runs the
fast resampler code in a for loop for each desired output.
Signed-off-by: Marcus Alagar <mvala079@gmail.com>
Implement experimental fast resampler for 16-bit integer complex numbers,
which uses the extra precondition that the phase never reaches more than
twice the length of `local_code` to sidestep all slow division steps
and instead use simple branching and addition/subtraction.
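A minimal scalar sketch of the division-free wrap, assuming the
precondition is interpreted as the raw index lying in
`[-code_length, 2 * code_length)`. The helper name `wrap_index_fast` is
hypothetical and this is not the vectorized code itself:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch: wrap idx into [0, code_length) without % or /.
 * Only valid under the stated precondition on idx. */
static int32_t wrap_index_fast(int32_t idx, int32_t code_length)
{
    assert(code_length > 0);
    assert(idx >= -code_length && idx < 2 * code_length);
    if (idx < 0) {
        idx += code_length;      /* one addition replaces the modulo */
    } else if (idx >= code_length) {
        idx -= code_length;      /* one subtraction replaces the modulo */
    }
    return idx;
}
```

Outside that range a single add or subtract is not enough, which is why
the precondition matters.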
Signed-off-by: Marcus Alagar <mvala079@gmail.com>
Optimize and clarify that `inPtr` is not stripmined in xn resamplers
by moving the declaration of the pointer out of the loop. This
may or may not remove an additional store or give a hint to the compiler
to keep the pointer value handy. More importantly, it clarifies
that the input data in these resamplers is specifically not being
stripmined through; it is only referenced while the offsets buffer
and the output are being stripmined through.
Also experimentally optimize xn resampler by doing the remainder
and the conversion (to address offset) in vector calculations instead
of in a for loop.
Finally, optimize the xn resampler functions by moving the
wrapping of the index and its conversion to a raw
address offset from a basic for loop to vector
computation.
Signed-off-by: Marcus Alagar <mvala079@gmail.com>
Implement 32-bit floating-point complex number resampler with
RVV intrinsics, using the same trick that worked with 16-bit
integer complex numbers: interpreting the two
32-bit components of each complex number as a single
64-bit number for the purposes of moving it around.
Signed-off-by: Marcus Alagar <mvala079@gmail.com>
Implement 16-bit integer complex number resampler
using RVV intrinsics by interpreting each complex number
(with two 16-bit components) as a single 32-bit integer
for the purposes of moving them around.
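The idea can be sketched in scalar C as follows, assuming a complex
sample is laid out as two adjacent `int16_t` values with no padding. The
type `cplx16_t` and the function `gather_cplx16` are hypothetical
stand-ins, not the actual kernel:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: real part followed by imaginary part. */
typedef struct {
    int16_t re;
    int16_t im;
} cplx16_t;

_Static_assert(sizeof(cplx16_t) == sizeof(uint32_t),
               "complex sample must fit exactly in one 32-bit word");

/* Move each selected complex sample as one opaque 32-bit word, so the
 * real and imaginary parts never need to be separated. */
static void gather_cplx16(cplx16_t *out, const cplx16_t *code,
                          const uint32_t *indices, size_t num_points)
{
    assert(out != NULL && code != NULL && indices != NULL);
    for (size_t i = 0; i < num_points; i++) {
        uint32_t word;
        memcpy(&word, &code[indices[i]], sizeof(word));
        memcpy(&out[i], &word, sizeof(word));
    }
}
```

`memcpy` is used here instead of a pointer cast only to stay within C's
aliasing rules.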
Signed-off-by: Marcus Alagar <mvala079@gmail.com>
Implement 16-bit integer xn resampler with a slightly
modified copy of the 32f xn resampler logic, along with the
16i resamplerxnpuppet to allow it to be tested.
Signed-off-by: Marcus Alagar <mvala079@gmail.com>
Implement 32-bit floating-point xn resampler
utilizing RVV intrinsics. It essentially uses the generic
logic except for the actual transfer part. Instead, it
saves the indices to sample from in a temporary buffer
that it then uses in RVV to do a simple unordered
indexed load into a unit-stride store.
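In scalar terms, the two-pass structure looks roughly like the sketch
below: one pass fills a temporary index buffer, and a second pass
gathers through it, which is the part an indexed load would replace.
The function and parameter names are hypothetical and the index formula
is only an assumed first-order one:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pass 1 (hypothetical): compute wrapped sample indices into a buffer. */
static void compute_indices(uint32_t *indices, float step, float offset,
                            int32_t code_length, size_t num_points)
{
    assert(indices != NULL && code_length > 0);
    for (size_t n = 0; n < num_points; n++) {
        const float pos = step * (float)n + offset;
        int32_t idx = (int32_t)pos;
        if ((float)idx > pos) {
            idx--; /* emulate floor() for negative positions */
        }
        idx %= code_length;
        if (idx < 0) {
            idx += code_length;
        }
        indices[n] = (uint32_t)idx;
    }
}

/* Pass 2: gather through the index buffer into contiguous output. */
static void gather_32f(float *out, const float *code,
                       const uint32_t *indices, size_t num_points)
{
    assert(out != NULL && code != NULL && indices != NULL);
    for (size_t n = 0; n < num_points; n++) {
        out[n] = code[indices[n]];
    }
}
```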
Signed-off-by: Marcus Alagar <mvala079@gmail.com>
Optimize 32-bit floating-point complex number to 8-bit
integer complex number conversion using RVV by skipping
the need to segment load/store by simply converting the raw
numbers stored within the complex numbers.
Signed-off-by: Marcus Alagar <mvala079@gmail.com>
Optimize 32-bit floating-point complex numbers to 16-bit
integer complex number conversion using RVV intrinsics,
by not segment storing/loading and instead converting
the raw numbers.
Signed-off-by: Marcus Alagar <mvala079@gmail.com>
Optimize conversion from 16-bit integer complex numbers to
32-bit floating-point complex numbers using RVV intrinsics
by skipping any need to distinguish between real and imaginary
numbers and instead just loading a contiguous selection
of components.
Timing results:
Old - generic: 39736.8 ms, rvv: 10406.7 ms
New - generic: 40479.9 ms, rvv: 8245.38 ms
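The approach amounts to the following scalar sketch, assuming the
complex samples are stored as interleaved real/imaginary components.
The function name `convert_16ic_to_32fc_flat` is hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Treat num_points complex samples as 2 * num_points plain components
 * and convert them in one flat pass, with no real/imaginary split. */
static void convert_16ic_to_32fc_flat(float *out, const int16_t *in,
                                      size_t num_points)
{
    assert(out != NULL && in != NULL);
    const size_t num_components = 2 * num_points;
    for (size_t i = 0; i < num_components; i++) {
        out[i] = (float)in[i];
    }
}
```

Because the conversion is elementwise, the interleaving is preserved
without any segment load or store.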
Signed-off-by: Marcus Alagar <mvala079@gmail.com>
Make 32-bit floating-point complex number to 8-bit integer
complex number conversion consistent with the generic
implementation. Specifically, each number is now saturated
to 8 bits after being multiplied by `INT8_MAX` but before
narrowing.
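The order of operations can be sketched in scalar C as below. The
function name is hypothetical, and the final cast truncates toward zero
for simplicity, which is an assumption; the rounding used by the real
kernels may differ:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scale by INT8_MAX, clamp to the int8_t range, then narrow. */
static void convert_32f_to_8i_saturated(int8_t *out, const float *in,
                                        size_t num_components)
{
    assert(out != NULL && in != NULL);
    for (size_t i = 0; i < num_components; i++) {
        float scaled = in[i] * (float)INT8_MAX;
        if (scaled > (float)INT8_MAX) {
            scaled = (float)INT8_MAX;
        } else if (scaled < (float)INT8_MIN) {
            scaled = (float)INT8_MIN;
        }
        out[i] = (int8_t)scaled; /* assumed rounding: truncation */
    }
}
```

Clamping before the narrowing cast is what keeps out-of-range inputs
from wrapping around.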
Signed-off-by: Marcus Alagar <mvala079@gmail.com>
Make 32-bit floating-point complex number to 16-bit integer
complex number conversion using RVV intrinsics consistent
with the generic implementation and documentation
by saturating the value to 16 bits before converting.
Signed-off-by: Marcus Alagar <mvala079@gmail.com>
Implement converting a vector of 32-bit floating point
complex numbers into a vector of 8-bit integer complex numbers
using RVV C intrinsics in `volk_gnsssdr_32fc_convert_8i_rvv`.
Required a lot of debugging, with attempts ranging from
saturation to not doing a narrowing conversion to using
the specification-recommended "round-to-odd" conversion.
In the end, the problem was that the generic
implementation, with no notice in the function documentation
at all, actually multiplies the floating-point number by `INT8_MAX`
before converting.
Signed-off-by: Marcus Alagar <mvala079@gmail.com>
Since the RVV 32fc to 8i conversion is not working, try implementing
a smaller scale conversion of a vector of 32-bit floating-point
complex numbers to 16-bit integer complex numbers.
Signed-off-by: Marcus Alagar <mvala079@gmail.com>