Skip to content

Lwip mainstream#826

Open
fjtrujy wants to merge 9 commits intops2dev:masterfrom
fjtrujy:lwip_mainstream
Open

Lwip mainstream#826
fjtrujy wants to merge 9 commits intops2dev:masterfrom
fjtrujy:lwip_mainstream

Conversation

@fjtrujy
Copy link
Copy Markdown
Member

@fjtrujy fjtrujy commented Apr 25, 2026

Summary

Switch the lwIP dependency from the long-standing ps2dev/lwip fork
(branch ps2-v2.0.3) to upstream lwip-tcpip/lwip STABLE-2_0_3_RELEASE,
and absorb the small set of accommodations into ps2sdk so the lwIP source tree
stays unmodified going forward.

This is the foundation for moving to newer lwIP releases — it removes the fork
as a blocker. A follow-up stacked PR upgrades to lwIP 2.2.1 on top of this.

Why

The ps2dev fork carried 5 PS2-specific patches against lwIP 2.0.3 source. Most
of them masked workarounds we can express through lwipopts.h/glue code, and
some are no longer needed at all in current code paths. Owning patches against
upstream lwIP makes every future version bump painful (rebases, conflict
resolution against a moving target). Switching to clean upstream removes that
maintenance tax.

What changes

Switch to upstream lwIP 2.0.3:

  • download_dependencies.sh: clone lwip-tcpip/lwip at STABLE-2_0_3_RELEASE
    instead of ps2dev/lwip at ps2-v2.0.3.

ps2sdk-side accommodations replacing the fork's patches:

  • iop/tcpip/tcpip-base/include/arch/cc.h: pre-include <sys/time.h>.
    Upstream sockets.h references struct timeval without including it when
    LWIP_TIMEVAL_PRIVATE=0.
  • iop/tcpip/tcpip-base/include/stdlib.h: minimal IOP shim that defines
    atoi via sysclib's strtol. Upstream lwIP expects atoi to be declared.
  • iop/tcpip/tcpip-base/include/lwipopts.h: set
    LWIP_ALLOW_MEM_FREE_FROM_OTHER_CONTEXT=1 so mem_free() uses
    SYS_ARCH_PROTECT rather than a mutex. Required because SMAP RX calls
    pbuf_free() from interrupt-disabled context — same effect as the fork's
    mem.c patch but achieved purely via lwipopts.

Driver-side fixes uncovered by the move:

  • iop/network/smap: handle 2-byte-aligned PBUF_RAM payloads in
    CopyToFIFO / CopyFromFIFO using lwl/lwr (R3000A has no unaligned
    word access — pbufs landing on a 2-byte boundary tripped AdEL on real
    hardware).
  • iop/network/smap/src/main.c: prune dead PRE_LWIP_130_COMPAT branches.
  • iop/tcpip/tcpip-netman: drop dead PRE_LWIP_130_COMPAT, advertise
    NETIF_FLAG_ETHERNET on the netif.

Configuration tightening for the IOP target:

  • Disable IP_REASSEMBLY and IP_FRAG: PS2 networking targets a local LAN
    with MTU=1500 and TCP negotiates MSS=1460, so fragmentation never arises in
    the common case. Frees MEMP_NUM_REASSDATA / MEMP_NUM_FRAG_PBUF pool
    entries.
  • Prune lwipopts.h entries that just restate lwIP 2.0.3 defaults
    (separate cleanups for the IOP and EE variants).
  • Style: one-object-per-line formatting in ps2api/ps2ip Makefile lists.
  • Move sys/time.h include from the Makefile's -include to arch/cc.h so
    it's visible to every consumer of arch/cc.h, not just lwIP source.

Compatibility

  • The build still produces ps2ip-nm.irx and ps2ip.irx with the same export
    table; downstream IRXs (smap, netman) rebuild without changes to their
    imports.
  • ps2link and other consumers continue to work without modification.

Test plan

  • Build the full IOP networking chain: tcpip-netmansmapnetman,
    install IRXs, build ps2link.
  • Real hardware: confirmed working — ps2link execee and reset succeed
    the same as with the fork's 2.0.3.

Follow-up

A stacked PR upgrades on top of this branch to lwIP 2.2.1, adding only what's
required (the load-bearing piece is a one-character fix in
common/include/errno.h — the section("data") attribute was creating a
section the IRX loader does not allocate, latent for years until upstream
removed LWIP_SOCKET_SET_ERRNO).

fjtrujy and others added 9 commits April 24, 2026 15:29
Switch the lwIP dependency from the ps2dev fork (ps2-v2.0.3 branch)
to the upstream lwip-tcpip/lwip repo at STABLE-2_0_3_RELEASE, and
carry the needed accommodations in our own code so the lwIP source
tree stays unmodified:

- Pre-include <sys/time.h> in ee/iop tcpip Makefiles and in
  ps2ip_internal.h. Upstream lwIP's sockets.h uses struct timeval
  without including sys/time.h when LWIP_TIMEVAL_PRIVATE=0.
- Provide a minimal IOP stdlib.h that defines atoi via sysclib's
  strtol. Upstream lwIP expects atoi to be declared.
- Set LWIP_ALLOW_MEM_FREE_FROM_OTHER_CONTEXT=1 in iop lwipopts.h so
  mem_free() uses SYS_ARCH_PROTECT rather than a mutex. This
  replaces the ps2dev fork's mem.c patch and is required because
  SMAP RX calls pbuf_free() in interrupt-disabled context.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CopyToFIFO and CopyFromFIFO did ((u32 *)buffer)[i / 4], which
assumes a 4-byte aligned buffer. On real hardware this trapped an
IOP AdEL (address error on load) exception when lwIP handed SMAP
a pbuf whose payload was only halfword-aligned, observed on the
PIO fallback path where length < 64 so SmapDmaTransfer returns 0
and the entire buffer is copied through PIO. Reproducible crash
signature: Module "SMAP_driver" Relative EPC 0x0000227C,
Cause 0x30000010 (ExcCode=4 AdEL), BadVAddr ending in 2/6/A/E.

Detect unaligned buffers and take a byte-wise word-assembly path;
GCC recognises the pattern and emits lwl/lwr. Also skip the SPD
DMA path for unaligned buffers since the DMA controller likewise
requires 4-byte alignment on source/destination.

The aligned fast path is unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Upstream lwIP has been >= 1.3.0 for well over a decade and lwIP
2.0.3 is the minimum target on this branch. Drop the unreachable
SMapOutput() wrapper and its associated ifdef forks in
SMapIFInit; smap-ps2ip now unconditionally uses etharp_output and
NETIF_FLAG_ETHARP. No functional change: builds only the
BUILDING_SMAP_PS2IP variant; the netman/netdev smap.irx binaries
are byte-identical (shasum unchanged).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PS2 networking targets a local LAN with MTU=1500, and TCP
negotiates MSS=1460 so IP fragmentation never arises in the
common case. Turn off IP_REASSEMBLY and IP_FRAG to drop the
corresponding lwIP code paths and free their static pool slots
(MEMP_NUM_REASSDATA / MEMP_NUM_FRAG_PBUF).

Savings observed in ps2ip-nm.irx: 94805 -> 91509 bytes (-3.3 KB
of code), plus runtime pool memory that no longer needs to be
reserved. ps2link end-to-end test (execee over TCP) confirmed
unaffected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…F_FLAG_ETHERNET

Symmetric with the earlier smap/main.c cleanup: lwIP 2.0.3 is
well past 1.3.0, so the SMapOutput() wrapper and ifdef forks in
SMapIFInit are dead code. Replace the conditional flag block with
a single line, and add NETIF_FLAG_ETHERNET to correctly mark the
netif as Ethernet — lwIP 2.0.3 defines this flag in netif.h:95
and certain code paths (e.g. when LWIP_ARP is disabled) branch
on it.

ps2link end-to-end test (execee over TCP) confirmed unaffected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reflow the ps2api_OBJECTS, ps2api_IPV4 and ps2ip_OBJECTS lists
in both ee/network/tcpip/Makefile and iop/tcpip/tcpip/Makefile
onto one object per line. Makes future additions (and diffs)
easier to review without changing any compile output — the
installed ps2ip-nm.irx is byte-identical before/after.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…aults

Remove eight defines from iop/tcpip-base/lwipopts.h that either
exactly match lwIP 2.0.3's own defaults in opt.h, or configure
dead subsystems that aren't compiled in this build:

- NO_SYS=0 (default 0)
- MEM_LIBC_MALLOC=0 (default 0)
- MEMP_NUM_TCPIP_MSG_API=8 (default 8)
- SYS_LIGHTWEIGHT_PROT=1 (default 1)
- SLIPIF_THREAD_STACKSIZE/PRIO (slipif.c not compiled)
- PPP_THREAD_STACKSIZE/PRIO (symbols not referenced anywhere in
  lwIP 2.0.3)

Also drop the no-op #ifndef LWIP_TCPIP_CORE_LOCKING_INPUT guard
around MEMP_NUM_TCPIP_MSG_INPKT; the symbol is always set later
in this same file so the guard never activates.

Proof of no-op: installed ps2ip-nm.irx SHA is unchanged
(c298b3cc2be09636c68ddd2ff077a62f2b2aa03c) before and after.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…defaults

Symmetric with the earlier iop lwipopts.h prune. Remove seven
defines from ee/network/tcpip/src/include/lwipopts.h that either
match lwIP 2.0.3's defaults in opt.h or configure dead
subsystems, and drop the no-op #ifndef guard:

- NO_SYS=0 (default 0)
- SYS_LIGHTWEIGHT_PROT=1 (default 1)
- SLIPIF_THREAD_STACKSIZE/PRIO (slipif.c not compiled)
- PPP_THREAD_STACKSIZE/PRIO (symbols not referenced anywhere in
  lwIP 2.0.3)
- #ifndef LWIP_TCPIP_CORE_LOCKING_INPUT guard around
  MEMP_NUM_TCPIP_MSG_INPKT (LWIP_TCPIP_CORE_LOCKING_INPUT is
  defined later in the same file, so the guard never activates)

Keep MEM_LIBC_MALLOC=1 (EE deliberately overrides the default 0
to reuse newlib's malloc) and MEMP_NUM_TCPIP_MSG_API=50 (differs
from default 8).

Proof of no-op: 31/35 object files are byte-identical; the 4 that
differ (ps2ip.o, sys_arch.o, ps2ip_ps2sdk.o, erl-support.o) only
diverge in LTO-specific sections (.gnu.lto_*, .gnu.debuglto_*),
while their .text, .data and .rodata are byte-identical.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…arch/cc.h

Upstream lwIP's sockets.h uses struct timeval without including
<sys/time.h> when LWIP_TIMEVAL_PRIVATE=0. The scaffolding commit
worked around this by adding a -include sys/time.h flag to the
EE and IOP tcpip Makefiles and a redundant #include in
ps2ip_internal.h. A cleaner home for it is the lwIP port header
arch/cc.h, which lwIP's opt.h (and hence every translation unit)
already pulls in. The -include cmdline hack becomes unnecessary,
and the workaround is self-documented where future maintainers
will look.

- ee/network/tcpip/src/include/arch/cc.h: add #include <sys/time.h>
- iop/tcpip/tcpip-base/include/arch/cc.h: add #include <sys/time.h>
- ee/network/tcpip/Makefile: drop -include sys/time.h
- iop/tcpip/tcpip/Makefile: drop -include sys/time.h
- iop/tcpip/tcpip-base/include/ps2ip_internal.h: drop redundant
  #include <sys/time.h> (now provided via lwip/opt.h → arch/cc.h)

Proof of no-op:
- iop ps2ip-nm.irx SHA unchanged (c298b3cc2be09636c68ddd2ff077a62f2b2aa03c)
- EE .text/.data/.rodata byte-identical across 11 verified files
  (tcp, tcp_in, tcp_out, udp, ip, sockets, api_lib, api_msg,
  ps2ip, sys_arch, ps2ip_ps2sdk)
- Obj-level hash diffs are LTO/DWARF metadata only (preprocessor
  state captured in debug info; generated code is identical).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@sp193
Copy link
Copy Markdown
Member

sp193 commented Apr 25, 2026

Is it really a good idea, to work around unaligned memory accesses in SMAP? I recall that LWIP had some options to deal with alignment limitations of some hosts, but it has been a very long time since I touched this, indeed.

Copying with the IOP is very slow.

@sp193
Copy link
Copy Markdown
Member

sp193 commented Apr 25, 2026

Re: LWIP_ALLOW_MEM_FREE_FROM_OTHER_CONTEXT

I may be wrong if things have changed, but we are not always using a mutex because semaphores are very slow on the IOP, compared to just disabling interrupts. Signalling a semaphore from an interrupt context is possible with iSignalSema().

@fjtrujy
Copy link
Copy Markdown
Member Author

fjtrujy commented Apr 25, 2026

Re: LWIP_ALLOW_MEM_FREE_FROM_OTHER_CONTEXT

I may be wrong if things have changed, but we are not always using a mutex because semaphores are very slow on the IOP, compared to just disabling interrupts. Signalling a semaphore from an interrupt context is possible with iSignalSema().

Good question.... I didn't think about that.
I was just looking to use mainstream changes to make future upgrades easier....
Anyway, I didn't notice any slowdown, but it is true that the only test that I did was using ps2link & a transfer_speed test

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/time.h>
#include <fcntl.h>
#include <stdint.h>
#include <errno.h>
#include <string.h>

#define BUF_SIZE 32768

void SleepThread(void);

static double time_diff(const struct timeval *start, const struct timeval *end) {
    return (end->tv_sec - start->tv_sec) + (end->tv_usec - start->tv_usec) / 1e6;
}

static void run_read_mode(const char *path) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        fprintf(stderr, "open %s: %s\n", path, strerror(errno));
        return;
    }

    char *buf = malloc(BUF_SIZE);
    if (!buf) {
        fprintf(stderr, "malloc failed\n");
        close(fd);
        return;
    }

    ssize_t r;
    uint64_t total = 0;
    struct timeval start, now, last_print;
    gettimeofday(&start, NULL);
    last_print = start;
    uint64_t interval_bytes = 0;

    printf("-- READ mode (read syscall) --\n");
    while ((r = read(fd, buf, BUF_SIZE)) > 0) {
        total += (uint64_t)r;
        interval_bytes += (uint64_t)r;
        gettimeofday(&now, NULL);
        double since_last = time_diff(&last_print, &now);
        if (since_last >= 1.0) {
            double instant_mib = (interval_bytes / (1024.0 * 1024.0)) / since_last;
            double elapsed = time_diff(&start, &now);
            double avg_mib = (total / (1024.0 * 1024.0)) / (elapsed > 0 ? elapsed : 1e-9);
            printf("Elapsed: %.2fs  Instant: %.3f MiB/s  Avg: %.3f MiB/s  Read: %lu bytes\n",
                   elapsed, instant_mib, avg_mib, (unsigned long)total);
            fflush(stdout);
            last_print = now;
            interval_bytes = 0;
        }
    }

    gettimeofday(&now, NULL);
    double total_secs = time_diff(&start, &now);
    if (r < 0) {
        fprintf(stderr, "read error: %s\n", strerror(errno));
    }
    printf("READ Done. Read %lu bytes in %.6fs (%.3f MiB/s)\n\n",
           (unsigned long)total, total_secs, (total / (1024.0 * 1024.0)) / (total_secs > 0 ? total_secs : 1e-9));

    free(buf);
    close(fd);
}

static void run_fread_mode(const char *path) {
    FILE *f = fopen(path, "rb");
    if (!f) {
        fprintf(stderr, "fopen %s failed: %s\n", path, strerror(errno));
        return;
    }

    char *buf = malloc(BUF_SIZE);
    if (!buf) {
        fprintf(stderr, "malloc failed\n");
        fclose(f);
        return;
    }

    size_t r;
    uint64_t total = 0;
    struct timeval start, now, last_print;
    gettimeofday(&start, NULL);
    last_print = start;
    uint64_t interval_bytes = 0;

    printf("-- FREAD mode (stdio fread) --\n");
    while ((r = fread(buf, 1, BUF_SIZE, f)) > 0) {
        total += (uint64_t)r;
        interval_bytes += (uint64_t)r;
        gettimeofday(&now, NULL);
        double since_last = time_diff(&last_print, &now);
        if (since_last >= 1.0) {
            double instant_mib = (interval_bytes / (1024.0 * 1024.0)) / since_last;
            double elapsed = time_diff(&start, &now);
            double avg_mib = (total / (1024.0 * 1024.0)) / (elapsed > 0 ? elapsed : 1e-9);
            printf("Elapsed: %.2fs  Instant: %.3f MiB/s  Avg: %.3f MiB/s  Read: %lu bytes\n",
                   elapsed, instant_mib, avg_mib, (unsigned long)total);
            fflush(stdout);
            last_print = now;
            interval_bytes = 0;
        }
    }

    gettimeofday(&now, NULL);
    double total_secs = time_diff(&start, &now);
    if (ferror(f)) {
        fprintf(stderr, "fread error\n");
    }
    printf("FREAD Done. Read %lu bytes in %.6fs (%.3f MiB/s)\n\n",
           (unsigned long)total, total_secs, (total / (1024.0 * 1024.0)) / (total_secs > 0 ? total_secs : 1e-9));

    free(buf);
    fclose(f);
}

int main(int argc, char **argv) {
    const char *path = "testfile.bin";
    const char *mode = "both"; // read, fread, both
    if (argc > 1) path = argv[1];
    if (argc > 2) mode = argv[2];

    if (strcmp(mode, "read") == 0) {
        run_read_mode(path);
    } else if (strcmp(mode, "fread") == 0) {
        run_fread_mode(path);
    } else {
        run_read_mode(path);
        run_fread_mode(path);
    }

    SleepThread(); // Prevent exit to see final output on PS2
    return 0;
}

@sp193
Copy link
Copy Markdown
Member

sp193 commented Apr 25, 2026

I was referring to the comment you made in the code. There are various good reasons to enable this setting.

If you revert enough settings (including turning off core locking), you should eventually end up with an LWIP version that performs incredibly poorly on the IOP.

Personally, I feel that we need to strike a balance between using the stock code and providing a custom version. This isn't a modern game console, so the code not fitting its architecture can be a very bad thing.

Re: the PBUF alignment problem that I commented on above. I wrote a comment about how it was fixed, a long time ago. Without this change, the memory buffer will be unaligned.

pbuf_alloc() will no longer align the payload field, for PBUF_RAM allocations.
	This is so that frames being output will not be unaligned.

	Note: this still doesn't solve all cases. For example, generating a UDP packet will result in an unaligned final address,
	as PBUF_TRANSPORT is a general "one size fits all" option.

Looking at that changelog, you should be careful because some changes came about due to incompatibilities with how the SCE IOP kernel works.

@uyjulian
Copy link
Copy Markdown
Member

Is it really a good idea, to work around unaligned memory accesses in SMAP? I recall that LWIP had some options to deal with alignment limitations of some hosts, but it has been a very long time since I touched this, indeed.

Copying with the IOP is very slow.

It avoids some difficult-to-diagnose issues such as hanging. Speed can come later.
(I don't see a major difference in the speed in this PR, measured with iperf, anyways)

Re: LWIP_ALLOW_MEM_FREE_FROM_OTHER_CONTEXT

I may be wrong if things have changed, but we are not always using a mutex because semaphores are very slow on the IOP, compared to just disabling interrupts. Signalling a semaphore from an interrupt context is possible with iSignalSema().

LWIP_ALLOW_MEM_FREE_FROM_OTHER_CONTEXT enabled (as in this PR) does not use a mutex and instead disables/enables interrupts.

@fjtrujy
Copy link
Copy Markdown
Member Author

fjtrujy commented Apr 25, 2026

Thanks for the careful review @sp193 — let me try to address both concerns, since I think we're actually aligned on what we want.

Re: LWIP_ALLOW_MEM_FREE_FROM_OTHER_CONTEXT and the comment in code

@uyjulian is right — when this option is =1, lwIP's mem_free() uses SYS_ARCH_PROTECT(lev) for the critical section, not a mutex. And on our port, SYS_ARCH_PROTECT resolves to CpuSuspendIntr() — see iop/tcpip/tcpip-base/sys_arch.c:355:

sys_prot_t sys_arch_protect(void)
{
    sys_prot_t Flags;
    CpuSuspendIntr(&Flags);
    return Flags;
}

So enabling this flag is exactly the "disable interrupts instead of taking a semaphore" path you're describing.
With the flag=0 (the default), lwIP would wrap mem_free() in a sys_mutex_t, which on the IOP means a semaphore — that's the slow path we want to avoid.

If the in-code comment reads like we're introducing a mutex, I'm happy to reword it to make the direction explicit (e.g., "uses SYS_ARCH_PROTECT (interrupt disable on IOP) rather than wrapping the critical section in a sys_mutex_t/semaphore"). Let me know if that helps.

Re: handling unaligned buffers in CopyToFIFO/CopyFromFIFO

Your historical pbuf-alignment comment is actually the reason I needed this commit. The ps2dev fork has a patch to pbuf_alloc() that adjusts payload alignment so that PBUF_RAM allocations land in a way that keeps the on-wire frame aligned. By moving to stock lwIP 2.0.3, that patch is gone — so there is now a real path where the PIO copy hits a halfword-aligned payload and the IOP traps an AdEL exception (reproducible: ExcCode=4, BadVAddr ending in 2/6/A/E, on the PIO fallback when len<64).

I agree pure IOP copy is slow, but:

  • The aligned fast path is unchanged, so the common case has zero added cost.
  • GCC turns the byte-wise word assembly into lwl/lwr, so even the slow path is reasonably tight (no per-byte function call overhead).
  • @uyjulian measured with iperf and didn't see a meaningful delta.

My goal in this PR is exactly to stop patching the lwIP source so future bumps stay easy.

Re: balance between stock and custom
Totally agree — that's the line I'm trying to walk here. The places where the IOP architecture genuinely conflicts with stock lwIP interrupt-context pbuf_free, alignment, no <sys/time.h> from sockets.h) are kept as accommodations outside the lwIP tree.

The settings I pruned in the lwipopts cleanup commits are only the ones that already match upstream defaults or configure dead code — I verified the installed ps2ip-nm.irx is byte-identical (sha unchanged) before/after each cleanup commit, so nothing performance-critical was inadvertently flipped.

Happy to revisit any specific commit if you'd like — and to reword the lwipopts comment to be less ambiguous.

Copy link
Copy Markdown
Member

@uyjulian uyjulian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aging tested with TCP iperf for 12 hours, works fine

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants