Skip to content

PCA fails sometimes (maybe if nodes are not 0...n?) #3443

@petrelharp

Description

@petrelharp

Consider the following code:

ts = msprime.sim_ancestry(3)
ts = msprime.sim_mutations(ts, rate=1)
t = ts.dump_tables()
t.nodes.clear()
for n in ts.nodes():
    t.nodes.append(n.replace(flags=1 if (n.id > 2 and n.id < 6) else 0))

ts.pca(num_components=2)
Traceback (most recent call last):
  File "<python-input-10>", line 1, in <module>
    ts.pca(num_components=2)
    ~~~~~~^^^^^^^^^^^^^^^^^^
  File "/home/peter/projects/tskit-dev/tskit/python/tskit/trees.py", line 9537, in pca
    U[i], D[i], Q[i], E[i] = _rand_svd(
                             ~~~~~~~~~^
        operator=_G,
        ^^^^^^^^^^^^
    ...<4 lines>...
        range_sketch=range_sketch[i],
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/home/peter/projects/tskit-dev/tskit/python/tskit/trees.py", line 9479, in _rand_svd
    Q = _rand_pow_range_finder(
        operator,
    ...<4 lines>...
        Q=range_sketch,
    )
  File "/home/peter/projects/tskit-dev/tskit/python/tskit/trees.py", line 9463, in _rand_pow_range_finder
    Q = operator(Q)
  File "/home/peter/projects/tskit-dev/tskit/python/tskit/trees.py", line 9518, in _G
    high = _f_high(
        arr=x,
    ...<3 lines>...
        windows=windows[i : i + 2],
    )
  File "/home/peter/projects/tskit-dev/tskit/python/tskit/trees.py", line 9229, in _genetic_relatedness_vector_node
    x = self._expand_indices(x, indices)
  File "/home/peter/projects/tskit-dev/tskit/python/tskit/trees.py", line 9216, in _expand_indices
    y[indices] = x
    ~^^^^^^^^^
IndexError: index 3 is out of bounds for axis 0 with size 3

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions