Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
173 changes: 71 additions & 102 deletions libraries/SPI/SPI.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -299,14 +299,16 @@ void SPIClass::setClockDivider(uint32_t clockDiv) {
SPI1CLK = clockDiv;
}

inline void SPIClass::setDataBits(uint16_t bits) {
inline void SPIClass::setDataBits(uint32_t bits) {
const uint32_t mask = ~((SPIMMOSI << SPILMOSI) | (SPIMMISO << SPILMISO));
bits--;
SPI1U1 = ((SPI1U1 & mask) | ((bits << SPILMOSI) | (bits << SPILMISO)));
}

uint8_t SPIClass::transfer(uint8_t data) {
#ifndef SPI_SKIP_INITIAL_BUSY_CHECK
while(SPI1CMD & SPIBUSY) {}
#endif
// reset to 8Bit mode
setDataBits(8);
SPI1W0 = data;
Expand Down Expand Up @@ -357,7 +359,9 @@ void SPIClass::transfer(void *buf, uint16_t count) {
}

void SPIClass::write(uint8_t data) {
#ifndef SPI_SKIP_INITIAL_BUSY_CHECK
while(SPI1CMD & SPIBUSY) {}
#endif
// reset to 8Bit mode
setDataBits(8);
SPI1W0 = data;
Expand All @@ -370,7 +374,9 @@ void SPIClass::write16(uint16_t data) {
}

void SPIClass::write16(uint16_t data, bool msb) {
#ifndef SPI_SKIP_INITIAL_BUSY_CHECK
while(SPI1CMD & SPIBUSY) {}
#endif
// Set to 16Bits transfer
setDataBits(16);
if(msb) {
Expand All @@ -389,7 +395,9 @@ void SPIClass::write32(uint32_t data) {
}

void SPIClass::write32(uint32_t data, bool msb) {
#ifndef SPI_SKIP_INITIAL_BUSY_CHECK
while(SPI1CMD & SPIBUSY) {}
#endif
// Set to 32Bits transfer
setDataBits(32);
if(msb) {
Expand All @@ -408,54 +416,55 @@ void SPIClass::write32(uint32_t data, bool msb) {

/**
* Note:
* data need to be aligned to 32Bit
* or you get an Fatal exception (9)
* data needs to be aligned to 32 bit boundaries like this: char outBuffer[spiBufferSize] __attribute__ ((aligned (4)));
* if not you may get an Fatal exception (9)
* @param data uint8_t *
* @param size uint32_t
*/
void SPIClass::writeBytes(const uint8_t * data, uint32_t size) {
uint32_t localSize;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something wacky is going on with indentation here; possibly accidental tabs instead of spaces?

Copy link
Copy Markdown
Author

@KelrobGithub KelrobGithub Jan 17, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@igrr
That's true, I'll beautify it. Seems that my editor indeed has used sometimes tabs, sometimes spaces. I've already removed the tabs in my local files, will submit them ASAP.

while(size) {
if(size > 64) {
writeBytes_(data, 64);
size -= 64;
data += 64;
} else {
writeBytes_(data, size);
size = 0;
}
}
if(size > 64)
localSize = 64;
else
localSize = size;
writeBytes_(data, localSize);
size -= localSize;
if ((size != 0) && (data)) data += localSize;
}
}

void SPIClass::writeBytes_(const uint8_t * data, uint8_t size) {
inline void SPIClass::writeBytes_(const uint8_t * data, uint32_t size) {
#ifndef SPI_SKIP_INITIAL_BUSY_CHECK
while(SPI1CMD & SPIBUSY) {}
// Set Bits to transfer
#endif
// Set bits to transfer
setDataBits(size * 8);

uint32_t * fifoPtr = (uint32_t*)&SPI1W0;
const uint32_t * dataPtr = (uint32_t*) data;
uint32_t dataSize = ((size + 3) / 4);

while(dataSize--) {
*fifoPtr = *dataPtr;
dataPtr++;
fifoPtr++;
}

__sync_synchronize();
if(data) {
memcpy((uint8_t*)&SPI1W0, data, size);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, accessing most peripheral registers other than with l32i/s32i isn't valid, but if you use memcpy with an unaligned source buffer this may very well happen.
memcpy is indeed better than a plain loop, because on Xtensa it interleaves multiple loads and stores, reducing pipeline stalls due to memory/APB access. I suggest to have a check whether data is 32-bit aligned first. If it is, then go via "fast path" using memcpy. If it's not, go via "slow path" of a loop with one memcpy per word. Unlike the old code this solution will gracefully handle unaligned source buffers, removing the alignment restriction (and for aligned buffers we will gain performance due to memcpy).

Copy link
Copy Markdown
Author

@KelrobGithub KelrobGithub Jan 17, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@igrr
OK, a 32-bit alignment check can be done easily during run-time.

But, what exactly shall I do in case the data it is not aligned? The old code has just copied 32-bit of data in a loop which could cause an "unaligment exception".
I could go for byte-wise copying (I fear word-wise as you mentioned solves only half of the problem, doesn't it?), but as memcpy is defined for byte-wise operation by nature as it does not know anything about the data types transferred it should be safe from my point of view even for an unaligned source buffer.

Nearly as stated here: https://github.com/jcmvbkbc/u-boot-xtensa/blob/master/doc/README.unaligned-memory-access.txt

I've tested my implementation on non-aligned memory locations withount any errors by creating an aligned buffer and setting the pointer used for SPI access to the first 4 locations with identical results (see code below):

// Define buffer which is aligned
char spiIntfOutBuffer[SPI_BUFFER_SIZE] __attribute__ ((aligned (4))); 
// Set start pointer somewhere in buffer to unalign it
uint8_t* spiIntfOutBufferPtr = (uint8_t*)&(spiIntfOutBuffer[3]);
// Use start pointer for SPI transfer
spiIntfTransferBytes(outBufferPtr, NULL, SPI_ADJ_SETUP_BUFFER_SIZE);

So from my point of view there is no need for changes. Or did I miss something which is not covered by my tests?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Memcpy will always use load/store instructions with correct alignment, so unaligned load/store exceptions are not the issue here. The issue may be that memcpy will try to write a smaller than 4-byte value into the peripheral register (i.e. perform a 2-byte or 1-byte write, aligned on 2-byte or 1-byte boundary, respectively). Doing so will not cause any exception, but the SPI FIFO was not designed to support 2-byte or 1-byte writes. So you may possibly end up with incorrect data being sent.
I'll try to come up with some test case which illustrates this.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After some testing it appears that even though the hardware wasn't designed to accommodate 2-byte and 1-byte writes, it behaves okay for some reason.
@KelrobGithub If you can address the other two minor comments, I will happily merge this. Thanks!

}
else {
// no data buffer specified, transmit dummy data
memset((uint8_t*)&SPI1W0, 0xFFFFFFFF, size);
}
// Start SPI transfer
SPI1CMD |= SPIBUSY;
// and wait for transfer to complete
while(SPI1CMD & SPIBUSY) {}
}

/**
* @param data uint8_t *
* @param size uint8_t max for size is 64Byte
* @param repeat uint32_t
*/
void SPIClass::writePattern(const uint8_t * data, uint8_t size, uint32_t repeat) {
if(size > 64) return; //max Hardware FIFO

#ifndef SPI_SKIP_INITIAL_BUSY_CHECK
while(SPI1CMD & SPIBUSY) {}

#endif
uint32_t buffer[16];
uint8_t *bufferPtr=(uint8_t *)&buffer;
const uint8_t *dataPtr = data;
Expand Down Expand Up @@ -523,95 +532,55 @@ void SPIClass::writePattern(const uint8_t * data, uint8_t size, uint32_t repeat)
}

/**
* Note:
* in and out need to be aligned to 32 bit boundaries like this: char outBuffer[spiBufferSize] __attribute__ ((aligned (4)));
* if not you may get an Fatal exception (9)
* @param out uint8_t *
* @param in uint8_t *
* @param size uint32_t
*/
void SPIClass::transferBytes(const uint8_t * out, uint8_t * in, uint32_t size) {
uint32_t localSize;
while(size) {
if(size > 64) {
transferBytes_(out, in, 64);
size -= 64;
if(out) out += 64;
if(in) in += 64;
} else {
transferBytes_(out, in, size);
size = 0;
}
}
if(size > 64)
localSize = 64;
else
localSize = size;

transferBytes_(out, in, localSize);
size -= localSize;
if (size != 0)
{
if(out) out += localSize;
if(in) in += localSize;
}
}
}

/**
* Note:
* in and out need to be aligned to 32Bit
* or you get an Fatal exception (9)
* @param out uint8_t *
* @param in uint8_t *
* @param size uint8_t (max 64)
*/

void SPIClass::transferBytesAligned_(const uint8_t * out, uint8_t * in, uint8_t size) {
if (!size)
return;

inline void SPIClass::transferBytes_(uint8_t * out, uint8_t * in, uint32_t size) {
#ifndef SPI_SKIP_INITIAL_BUSY_CHECK
while(SPI1CMD & SPIBUSY) {}
// Set in/out Bits to transfer

#endif
// Set bits to transfer
setDataBits(size * 8);

volatile uint32_t *fifoPtr = &SPI1W0;

if (out) {
uint8_t outSize = ((size + 3) / 4);
uint32_t *dataPtr = (uint32_t*) out;
while (outSize--) {
*(fifoPtr++) = *(dataPtr++);
}
} else {
uint8_t outSize = ((size + 3) / 4);
// no out data only read fill with dummy data!
while (outSize--) {
*(fifoPtr++) = 0xFFFFFFFF;
}
}

if(out) {
memcpy((uint8_t*)&SPI1W0, out, size);
}
else {
// no out buffer specified, transmit dummy data
memset((uint8_t*)&SPI1W0, 0xFFFFFFFF, size);
}

// Start SPI transfer
SPI1CMD |= SPIBUSY;

// and wait for transfer to complete
while(SPI1CMD & SPIBUSY) {}

if (in) {
uint32_t *dataPtr = (uint32_t*) in;
fifoPtr = &SPI1W0;
int inSize = size;
// Unlike outSize above, inSize tracks *bytes* since we must transfer only the requested bytes to the app to avoid overwriting other vars.
while (inSize >= 4) {
*(dataPtr++) = *(fifoPtr++);
inSize -= 4;
in += 4;
}
volatile uint8_t *fifoPtrB = (volatile uint8_t *)fifoPtr;
while (inSize--) {
*(in++) = *(fifoPtrB++);
}
}
}


void SPIClass::transferBytes_(const uint8_t * out, uint8_t * in, uint8_t size) {
if (!((uint32_t)out & 3) && !((uint32_t)in & 3)) {
// Input and output are both 32b aligned or NULL
transferBytesAligned_(out, in, size);
} else {
// HW FIFO has 64b limit and ::transferBytes breaks up large xfers into 64byte chunks before calling this function
// We know at this point at least one direction is misaligned, so use temporary buffer to align everything
// No need for separate out and in aligned copies, we can overwrite our out copy with the input data safely
uint8_t aligned[64]; // Stack vars will be 32b aligned
if (out) {
memcpy(aligned, out, size);
}
transferBytesAligned_(out ? aligned : nullptr, in ? aligned : nullptr, size);
if (in) {
memcpy(in, aligned, size);
}
// then return received data
if(in) {
memcpy((uint8_t*)in, (uint8_t*)&SPI1W0, size);
}
}

Expand Down
11 changes: 5 additions & 6 deletions libraries/SPI/SPI.h
Original file line number Diff line number Diff line change
Expand Up @@ -23,7 +23,8 @@

#include <Arduino.h>

#define SPI_HAS_TRANSACTION 1
#define SPI_HAS_TRANSACTION
#define SPI_SKIP_INITIAL_BUSY_CHECK //comment this line for backward compatibility if really needed
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say, backwards compatibility should be the default option. Making people modify library source code to achieve compatibility is not very nice (especially for a library bundled with the core, which is installed by the IDE who knows where...).

Actually, we only need to wait for SPI to become !busy if the current transfer uses different settings (frequency, data size, data order, etc) from the previous one. So I suggest to move the decision of waiting or not for the transfer to the run time.
So maybe add a member function like _waitForSPINotBusy which would compare new settings to the ones used during previous transfer and either return (if they are the same) or wait for SPI to be !busy (if they are different).

Copy link
Copy Markdown
Author

@KelrobGithub KelrobGithub Jan 17, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@igrr
I do think for backwards compatibility in the same way as you do.
But in fact I think that the initial check for !busy is not needed in any of the SPI functions as every write- and transfer-function already checks that its own SPI transfer completes before leaving its context by using
while(SPI1CMD & SPIBUSY) {}
at the end of the function.

Therefore it is not possible to change frequency during a transfer.

I've only added the #define which can be commented as a service in case I've overlooked something. But currently I don't see any threat here, so maybe we can just delete the #define and the initial check completely.

Or is there an effect I currently don't know about that changing the frequency, data size or whatever activates SPI's busy flag? Then I would suggest to wait in the setFrequency or setDataBit functions but not in every transfer.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the define shouldn't be necessary, let's just remove it.


// This defines are not representing the real Divider of the ESP8266
// the Defines match to an AVR Arduino on 16MHz for better compatibility
Expand Down Expand Up @@ -75,11 +76,9 @@ class SPIClass {
void endTransaction(void);
private:
bool useHwCs;
uint8_t pinSet;
void writeBytes_(const uint8_t * data, uint8_t size);
void transferBytes_(const uint8_t * out, uint8_t * in, uint8_t size);
void transferBytesAligned_(const uint8_t * out, uint8_t * in, uint8_t size);
inline void setDataBits(uint16_t bits);
inline void writeBytes_(uint8_t * data, uint32_t size);
inline void transferBytes_(uint8_t * out, uint8_t * in, uint32_t size);
inline void setDataBits(uint32_t bits);
};

#if !defined(NO_GLOBAL_INSTANCES) && !defined(NO_GLOBAL_SPI)
Expand Down