Lwip mainstream#826
Conversation
Switch the lwIP dependency from the ps2dev fork (ps2-v2.0.3 branch) to the upstream lwip-tcpip/lwip repo at STABLE-2_0_3_RELEASE, and carry the needed accommodations in our own code so the lwIP source tree stays unmodified: - Pre-include <sys/time.h> in ee/iop tcpip Makefiles and in ps2ip_internal.h. Upstream lwIP's sockets.h uses struct timeval without including sys/time.h when LWIP_TIMEVAL_PRIVATE=0. - Provide a minimal IOP stdlib.h that defines atoi via sysclib's strtol. Upstream lwIP expects atoi to be declared. - Set LWIP_ALLOW_MEM_FREE_FROM_OTHER_CONTEXT=1 in iop lwipopts.h so mem_free() uses SYS_ARCH_PROTECT rather than a mutex. This replaces the ps2dev fork's mem.c patch and is required because SMAP RX calls pbuf_free() in interrupt-disabled context. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CopyToFIFO and CopyFromFIFO did ((u32 *)buffer)[i / 4], which assumes a 4-byte aligned buffer. On real hardware this trapped an IOP AdEL (address error on load) exception when lwIP handed SMAP a pbuf whose payload was only halfword-aligned, observed on the PIO fallback path where length < 64 so SmapDmaTransfer returns 0 and the entire buffer is copied through PIO. Reproducible crash signature: Module "SMAP_driver" Relative EPC 0x0000227C, Cause 0x30000010 (ExcCode=4 AdEL), BadVAddr ending in 2/6/A/E. Detect unaligned buffers and take a byte-wise word-assembly path; GCC recognises the pattern and emits lwl/lwr. Also skip the SPD DMA path for unaligned buffers since the DMA controller likewise requires 4-byte alignment on source/destination. The aligned fast path is unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Upstream lwIP has been >= 1.3.0 for well over a decade and lwIP 2.0.3 is the minimum target on this branch. Drop the unreachable SMapOutput() wrapper and its associated ifdef forks in SMapIFInit; smap-ps2ip now unconditionally uses etharp_output and NETIF_FLAG_ETHARP. No functional change: builds only the BUILDING_SMAP_PS2IP variant; the netman/netdev smap.irx binaries are byte-identical (shasum unchanged). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PS2 networking targets a local LAN with MTU=1500, and TCP negotiates MSS=1460 so IP fragmentation never arises in the common case. Turn off IP_REASSEMBLY and IP_FRAG to drop the corresponding lwIP code paths and free their static pool slots (MEMP_NUM_REASSDATA / MEMP_NUM_FRAG_PBUF). Savings observed in ps2ip-nm.irx: 94805 -> 91509 bytes (-3.3 KB of code), plus runtime pool memory that no longer needs to be reserved. ps2link end-to-end test (execee over TCP) confirmed unaffected. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…F_FLAG_ETHERNET Symmetric with the earlier smap/main.c cleanup: lwIP 2.0.3 is well past 1.3.0, so the SMapOutput() wrapper and ifdef forks in SMapIFInit are dead code. Replace the conditional flag block with a single line, and add NETIF_FLAG_ETHERNET to correctly mark the netif as Ethernet — lwIP 2.0.3 defines this flag in netif.h:95 and certain code paths (e.g. when LWIP_ARP is disabled) branch on it. ps2link end-to-end test (execee over TCP) confirmed unaffected. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reflow the ps2api_OBJECTS, ps2api_IPV4 and ps2ip_OBJECTS lists in both ee/network/tcpip/Makefile and iop/tcpip/tcpip/Makefile onto one object per line. Makes future additions (and diffs) easier to review without changing any compile output — the installed ps2ip-nm.irx is byte-identical before/after. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…aults Remove eight defines from iop/tcpip-base/lwipopts.h that either exactly match lwIP 2.0.3's own defaults in opt.h, or configure dead subsystems that aren't compiled in this build: - NO_SYS=0 (default 0) - MEM_LIBC_MALLOC=0 (default 0) - MEMP_NUM_TCPIP_MSG_API=8 (default 8) - SYS_LIGHTWEIGHT_PROT=1 (default 1) - SLIPIF_THREAD_STACKSIZE/PRIO (slipif.c not compiled) - PPP_THREAD_STACKSIZE/PRIO (symbols not referenced anywhere in lwIP 2.0.3) Also drop the no-op #ifndef LWIP_TCPIP_CORE_LOCKING_INPUT guard around MEMP_NUM_TCPIP_MSG_INPKT; the symbol is always set later in this same file so the guard never activates. Proof of no-op: installed ps2ip-nm.irx SHA is unchanged (c298b3cc2be09636c68ddd2ff077a62f2b2aa03c) before and after. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…defaults Symmetric with the earlier iop lwipopts.h prune. Remove seven defines from ee/network/tcpip/src/include/lwipopts.h that either match lwIP 2.0.3's defaults in opt.h or configure dead subsystems, and drop the no-op #ifndef guard: - NO_SYS=0 (default 0) - SYS_LIGHTWEIGHT_PROT=1 (default 1) - SLIPIF_THREAD_STACKSIZE/PRIO (slipif.c not compiled) - PPP_THREAD_STACKSIZE/PRIO (symbols not referenced anywhere in lwIP 2.0.3) - #ifndef LWIP_TCPIP_CORE_LOCKING_INPUT guard around MEMP_NUM_TCPIP_MSG_INPKT (LWIP_TCPIP_CORE_LOCKING_INPUT is defined later in the same file, so the guard never activates) Keep MEM_LIBC_MALLOC=1 (EE deliberately overrides the default 0 to reuse newlib's malloc) and MEMP_NUM_TCPIP_MSG_API=50 (differs from default 8). Proof of no-op: 31/35 object files are byte-identical; the 4 that differ (ps2ip.o, sys_arch.o, ps2ip_ps2sdk.o, erl-support.o) only diverge in LTO-specific sections (.gnu.lto_*, .gnu.debuglto_*), while their .text, .data and .rodata are byte-identical. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…arch/cc.h Upstream lwIP's sockets.h uses struct timeval without including <sys/time.h> when LWIP_TIMEVAL_PRIVATE=0. The scaffolding commit worked around this by adding a -include sys/time.h flag to the EE and IOP tcpip Makefiles and a redundant #include in ps2ip_internal.h. A cleaner home for it is the lwIP port header arch/cc.h, which lwIP's opt.h (and hence every translation unit) already pulls in. The -include cmdline hack becomes unnecessary, and the workaround is self-documented where future maintainers will look. - ee/network/tcpip/src/include/arch/cc.h: add #include <sys/time.h> - iop/tcpip/tcpip-base/include/arch/cc.h: add #include <sys/time.h> - ee/network/tcpip/Makefile: drop -include sys/time.h - iop/tcpip/tcpip/Makefile: drop -include sys/time.h - iop/tcpip/tcpip-base/include/ps2ip_internal.h: drop redundant #include <sys/time.h> (now provided via lwip/opt.h → arch/cc.h) Proof of no-op: - iop ps2ip-nm.irx SHA unchanged (c298b3cc2be09636c68ddd2ff077a62f2b2aa03c) - EE .text/.data/.rodata byte-identical across 11 verified files (tcp, tcp_in, tcp_out, udp, ip, sockets, api_lib, api_msg, ps2ip, sys_arch, ps2ip_ps2sdk) - Obj-level hash diffs are LTO/DWARF metadata only (preprocessor state captured in debug info; generated code is identical). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Is it really a good idea, to work around unaligned memory accesses in SMAP? I recall that LWIP had some options to deal with alignment limitations of some hosts, but it has been a very long time since I touched this, indeed. Copying with the IOP is very slow. |
|
Re: LWIP_ALLOW_MEM_FREE_FROM_OTHER_CONTEXT I may be wrong if things have changed, but we are not always using a mutex because semaphores are very slow on the IOP, compared to just disabling interrupts. Signalling a semaphore from an interrupt context is possible with iSignalSema(). |
Good question.... I didn't think about that. #include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/time.h>
#include <fcntl.h>
#include <stdint.h>
#include <errno.h>
#include <string.h>
#define BUF_SIZE 32768
void SleepThread(void);
static double time_diff(const struct timeval *start, const struct timeval *end) {
return (end->tv_sec - start->tv_sec) + (end->tv_usec - start->tv_usec) / 1e6;
}
static void run_read_mode(const char *path) {
int fd = open(path, O_RDONLY);
if (fd < 0) {
fprintf(stderr, "open %s: %s\n", path, strerror(errno));
return;
}
char *buf = malloc(BUF_SIZE);
if (!buf) {
fprintf(stderr, "malloc failed\n");
close(fd);
return;
}
ssize_t r;
uint64_t total = 0;
struct timeval start, now, last_print;
gettimeofday(&start, NULL);
last_print = start;
uint64_t interval_bytes = 0;
printf("-- READ mode (read syscall) --\n");
while ((r = read(fd, buf, BUF_SIZE)) > 0) {
total += (uint64_t)r;
interval_bytes += (uint64_t)r;
gettimeofday(&now, NULL);
double since_last = time_diff(&last_print, &now);
if (since_last >= 1.0) {
double instant_mib = (interval_bytes / (1024.0 * 1024.0)) / since_last;
double elapsed = time_diff(&start, &now);
double avg_mib = (total / (1024.0 * 1024.0)) / (elapsed > 0 ? elapsed : 1e-9);
printf("Elapsed: %.2fs Instant: %.3f MiB/s Avg: %.3f MiB/s Read: %lu bytes\n",
elapsed, instant_mib, avg_mib, (unsigned long)total);
fflush(stdout);
last_print = now;
interval_bytes = 0;
}
}
gettimeofday(&now, NULL);
double total_secs = time_diff(&start, &now);
if (r < 0) {
fprintf(stderr, "read error: %s\n", strerror(errno));
}
printf("READ Done. Read %lu bytes in %.6fs (%.3f MiB/s)\n\n",
(unsigned long)total, total_secs, (total / (1024.0 * 1024.0)) / (total_secs > 0 ? total_secs : 1e-9));
free(buf);
close(fd);
}
static void run_fread_mode(const char *path) {
FILE *f = fopen(path, "rb");
if (!f) {
fprintf(stderr, "fopen %s failed: %s\n", path, strerror(errno));
return;
}
char *buf = malloc(BUF_SIZE);
if (!buf) {
fprintf(stderr, "malloc failed\n");
fclose(f);
return;
}
size_t r;
uint64_t total = 0;
struct timeval start, now, last_print;
gettimeofday(&start, NULL);
last_print = start;
uint64_t interval_bytes = 0;
printf("-- FREAD mode (stdio fread) --\n");
while ((r = fread(buf, 1, BUF_SIZE, f)) > 0) {
total += (uint64_t)r;
interval_bytes += (uint64_t)r;
gettimeofday(&now, NULL);
double since_last = time_diff(&last_print, &now);
if (since_last >= 1.0) {
double instant_mib = (interval_bytes / (1024.0 * 1024.0)) / since_last;
double elapsed = time_diff(&start, &now);
double avg_mib = (total / (1024.0 * 1024.0)) / (elapsed > 0 ? elapsed : 1e-9);
printf("Elapsed: %.2fs Instant: %.3f MiB/s Avg: %.3f MiB/s Read: %lu bytes\n",
elapsed, instant_mib, avg_mib, (unsigned long)total);
fflush(stdout);
last_print = now;
interval_bytes = 0;
}
}
gettimeofday(&now, NULL);
double total_secs = time_diff(&start, &now);
if (ferror(f)) {
fprintf(stderr, "fread error\n");
}
printf("FREAD Done. Read %lu bytes in %.6fs (%.3f MiB/s)\n\n",
(unsigned long)total, total_secs, (total / (1024.0 * 1024.0)) / (total_secs > 0 ? total_secs : 1e-9));
free(buf);
fclose(f);
}
int main(int argc, char **argv) {
const char *path = "testfile.bin";
const char *mode = "both"; // read, fread, both
if (argc > 1) path = argv[1];
if (argc > 2) mode = argv[2];
if (strcmp(mode, "read") == 0) {
run_read_mode(path);
} else if (strcmp(mode, "fread") == 0) {
run_fread_mode(path);
} else {
run_read_mode(path);
run_fread_mode(path);
}
SleepThread(); // Prevent exit to see final output on PS2
return 0;
} |
|
I was referring to the comment you made in the code. There are various good reasons to enable this setting. If you revert enough settings (including turning off core locking), you should eventually end up with an LWIP version that performs incredibly poorly on the IOP. Personally, I feel that we need to strike a balance between using the stock code and providing a custom version. This isn't a modern game console, so the code not fitting its architecture can be a very bad thing. Re: the PBUF alignment problem that I commented on above. I wrote a comment about how it was fixed, a long time ago. Without this change, the memory buffer will be unaligned. Looking at that changelog, you should be careful because some changes came about due to incompatibilities with how the SCE IOP kernel works. |
It avoids some difficult-to-diagnose issues such as hanging. Speed can come later.
|
|
Thanks for the careful review @sp193 — let me try to address both concerns, since I think we're actually aligned on what we want.
@uyjulian is right — when this option is =1, lwIP's So enabling this flag is exactly the "disable interrupts instead of taking a semaphore" path you're describing. If the in-code comment reads like we're introducing a mutex, I'm happy to reword it to make the direction explicit (e.g., "uses
Your historical pbuf-alignment comment is actually the reason I needed this commit. The ps2dev fork has a patch to I agree pure IOP copy is slow, but:
My goal in this PR is exactly to stop patching the lwIP source so future bumps stay easy.
The settings I pruned in the Happy to revisit any specific commit if you'd like — and to reword the lwipopts comment to be less ambiguous. |
Summary
Switch the lwIP dependency from the long-standing ps2dev/lwip fork
(branch
ps2-v2.0.3) to upstreamlwip-tcpip/lwipSTABLE-2_0_3_RELEASE,and absorb the small set of accommodations into ps2sdk so the lwIP source tree
stays unmodified going forward.
This is the foundation for moving to newer lwIP releases — it removes the fork
as a blocker. A follow-up stacked PR upgrades to lwIP 2.2.1 on top of this.
Why
The ps2dev fork carried 5 PS2-specific patches against lwIP 2.0.3 source. Most
of them masked workarounds we can express through
lwipopts.h/glue code, andsome are no longer needed at all in current code paths. Owning patches against
upstream lwIP makes every future version bump painful (rebases, conflict
resolution against a moving target). Switching to clean upstream removes that
maintenance tax.
What changes
Switch to upstream lwIP 2.0.3:
download_dependencies.sh: clonelwip-tcpip/lwipatSTABLE-2_0_3_RELEASEinstead of
ps2dev/lwipatps2-v2.0.3.ps2sdk-side accommodations replacing the fork's patches:
iop/tcpip/tcpip-base/include/arch/cc.h: pre-include<sys/time.h>.Upstream
sockets.hreferencesstruct timevalwithout including it whenLWIP_TIMEVAL_PRIVATE=0.iop/tcpip/tcpip-base/include/stdlib.h: minimal IOP shim that definesatoivia sysclib'sstrtol. Upstream lwIP expectsatoito be declared.iop/tcpip/tcpip-base/include/lwipopts.h: setLWIP_ALLOW_MEM_FREE_FROM_OTHER_CONTEXT=1somem_free()usesSYS_ARCH_PROTECTrather than a mutex. Required because SMAP RX callspbuf_free()from interrupt-disabled context — same effect as the fork'smem.cpatch but achieved purely via lwipopts.Driver-side fixes uncovered by the move:
iop/network/smap: handle 2-byte-aligned PBUF_RAM payloads inCopyToFIFO/CopyFromFIFOusinglwl/lwr(R3000A has no unalignedword access — pbufs landing on a 2-byte boundary tripped AdEL on real
hardware).
iop/network/smap/src/main.c: prune deadPRE_LWIP_130_COMPATbranches.iop/tcpip/tcpip-netman: drop deadPRE_LWIP_130_COMPAT, advertiseNETIF_FLAG_ETHERNETon the netif.Configuration tightening for the IOP target:
IP_REASSEMBLYandIP_FRAG: PS2 networking targets a local LANwith MTU=1500 and TCP negotiates MSS=1460, so fragmentation never arises in
the common case. Frees
MEMP_NUM_REASSDATA/MEMP_NUM_FRAG_PBUFpoolentries.
lwipopts.hentries that just restate lwIP 2.0.3 defaults(separate cleanups for the IOP and EE variants).
ps2api/ps2ipMakefile lists.sys/time.hinclude from the Makefile's-includetoarch/cc.hsoit's visible to every consumer of
arch/cc.h, not just lwIP source.Compatibility
ps2ip-nm.irxandps2ip.irxwith the same exporttable; downstream IRXs (smap, netman) rebuild without changes to their
imports.
ps2linkand other consumers continue to work without modification.Test plan
tcpip-netman→smap→netman,install IRXs, build
ps2link.ps2link execeeandresetsucceedthe same as with the fork's 2.0.3.
Follow-up
A stacked PR upgrades on top of this branch to lwIP 2.2.1, adding only what's
required (the load-bearing piece is a one-character fix in
common/include/errno.h— thesection("data")attribute was creating asection the IRX loader does not allocate, latent for years until upstream
removed
LWIP_SOCKET_SET_ERRNO).