Package org.apache.sysds.runtime.data
Class SparseBlockCSR
- java.lang.Object
-
- org.apache.sysds.runtime.data.SparseBlock
-
- org.apache.sysds.runtime.data.SparseBlockCSR
-
- All Implemented Interfaces:
Serializable
public class SparseBlockCSR extends SparseBlock
SparseBlock implementation that realizes a traditional 'compressed sparse row' representation, where the entire sparse block is stored as three arrays: ptr of length rlen+1 to store offsets per row, and indexes/values of length nnz to store column indexes and values of non-zero entries. This format is very memory efficient for sparse (but not ultra-sparse) matrices and provides very good performance for common operations, partially due to lower memory bandwidth requirements. However, this format is slow on incremental construction (because it does not allow append/sort per row) without reshifting. Finally, the total nnz is limited to INTEGER_MAX, whereas for SparseBlockMCSR only the nnz per row are limited to INTEGER_MAX. TODO: extensions for faster incremental construction (e.g., max row) TODO more efficient fused setIndexRange impl to avoid repeated copies and updates- See Also:
- Serialized Form
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from class org.apache.sysds.runtime.data.SparseBlock
SparseBlock.Type
-
-
Constructor Summary
Constructors Constructor Description SparseBlockCSR(int rlen)SparseBlockCSR(int[] rowPtr, int[] colInd, double[] values, int nnz)SparseBlockCSR(int rlen, int capacity)SparseBlockCSR(int rows, int[] rowInd, int[] colInd, double[] values)Copy constructor for COO representationSparseBlockCSR(int rlen, int capacity, int size)SparseBlockCSR(int rows, int nnz, int[] colInd)Copy constructor for given array of column indexes, which identifies rows by position and implies values of 1.SparseBlockCSR(SparseBlock sblock)Copy constructor sparse block abstraction.SparseBlockCSR(SparseRow[] rows, int nnz)Copy constructor old sparse row representation.
-
Method Summary
All Methods Static Methods Instance Methods Concrete Methods Modifier and Type Method Description booleanadd(int r, int c, double v)Add a value to a matrix cell (r,c).voidallocate(int r)Allocate the underlying data structure holding non-zero values of row r if necessary.voidallocate(int r, int nnz)Allocate the underlying data structure holding non-zero values of row r if necessary, w/ given size.voidallocate(int r, int ennz, int maxnnz)Allocate the underlying data structure holding non-zero values of row r w/ the specified estimated nnz and max nnz.voidappend(int r, int c, double v)Append a value to the end of the physical representation.booleancheckValidity(int rlen, int clen, long nnz, boolean strict)Validate the correctness of the internal data structures of the different sparse block implementations.voidcompact()voidcompact(int r)Re-allocate physical row if physical size exceeds logical size plus resize factor.voiddeleteIndexRange(int r, int cl, int cu)Deletes all non-zero values of the given column range [cl,cu) in row r.static longestimateSizeInMemory(long nrows, long ncols, double sparsity)Get the estimated in-memory size of the sparse block in CSR with the given dimensions w/o accounting for overallocation.SparseRowget(int r)Get values of row r in the format of a sparse row.doubleget(int r, int c)Get value of matrix cell (r,c).int[]indexes()Get raw access to underlying array of column indices For use in GPU codeint[]indexes(int r)Get the sorted array of column indexes of all non-zero entries in row r.voidinitSparse(int rlen, int nnz, DataInput in)Initializes the CSR sparse block from an ordered input stream of sparse rows (rownnz, jv-pairs*).voidinitUltraSparse(int nnz, DataInput in)Initializes the CSR sparse block from an ordered input stream of ultra-sparse ijv triples.booleanisAllocated(int r)Indicates if the underlying data structure for a given row is already allocated.booleanisContiguous()Indicates if the underlying data structures returned by values and indexes are contiguous arrays, which can be exploited for more efficient operations.booleanisEmpty(int r)Get information if row r is empty, i.e., does not contain non-zero values.booleanisThreadSafe()Indicates if the underlying implementation allows thread-safe row updates if concurrent threads update disjoint rows.intnumRows()Get the number of rows in the sparse block.intpos(int r)Get the starting position of row r in the indexes/values arrays returned by indexes(r) and values(r).intposFIndexGT(int r, int c)Get position of first column index greater than column c in row r.intposFIndexGTE(int r, int c)Get position of first column index greater than or equal column c in row r.intposFIndexLTE(int r, int c)Get position of first column index lower than or equal column c in row r.voidreset()Clears the sparse block by deleting non-zero values.voidreset(int ennz, int maxnnz)Clears the sparse block by deleting non-zero values.voidreset(int r, int ennz, int maxnnz)Clears row r of the sparse block by deleting non-zero values.int[]rowPointers()Get raw access to underlying array of row pointers For use in GPU codebooleanset(int r, int c, double v)Set the value of a matrix cell (r,c).voidset(int r, SparseRow row, boolean deep)Set the values of row r to the given sparse row.voidsetIndexRange(int r, int cl, int cu, double[] v, int[] vix, int vpos, int vlen)Sets a sparse array of non-zeros values and indexes into the column range [cl,cu) in row r.voidsetIndexRange(int r, int cl, int cu, double[] v, int vix, int vlen)Sets a dense array of non-zeros values into the column range [cl,cu) in row r.voidsetIndexRange(int rl, int ru, int cl, int cu, double[] v, int vix, int vlen)Inserts a sorted row-major array of non-zero values into the row and column range [rl,ru) and [cl,cu).voidsetIndexRange(int rl, int ru, int cl, int cu, SparseBlock sb)Inserts a sparse block into the row and column range [rl,ru) and [cl,cu).longsize()Get the number of non-zero values in the sparse block.intsize(int r)Get the number of non-zero values in row r.longsize(int rl, int ru)Get the number of non-zeros values in the row range of [rl, ru).longsize(int rl, int ru, int cl, int cu)Get the number of non-zeros values in the row and column range of [rl/cl, ru/cu);voidsort()Sort all non-zero value/index pairs of the sparse block by row and column index.voidsort(int r)Sort all non-zero value/index pairs of row r column index.StringtoString()double[]values()Get raw access to underlying array of values For use in GPU codedouble[]values(int r)Get the array of all non-zero entries in row r, sorted by their column indexes.-
Methods inherited from class org.apache.sysds.runtime.data.SparseBlock
getIterator, getIterator, getIterator, isAligned, isAligned
-
-
-
-
Constructor Detail
-
SparseBlockCSR
public SparseBlockCSR(int rlen)
-
SparseBlockCSR
public SparseBlockCSR(int rlen, int capacity)
-
SparseBlockCSR
public SparseBlockCSR(int rlen, int capacity, int size)
-
SparseBlockCSR
public SparseBlockCSR(int[] rowPtr, int[] colInd, double[] values, int nnz)
-
SparseBlockCSR
public SparseBlockCSR(SparseBlock sblock)
Copy constructor sparse block abstraction.- Parameters:
sblock- sparse block to copy
-
SparseBlockCSR
public SparseBlockCSR(SparseRow[] rows, int nnz)
Copy constructor old sparse row representation.- Parameters:
rows- array of sparse rowsnnz- number of non-zeroes
-
SparseBlockCSR
public SparseBlockCSR(int rows, int[] rowInd, int[] colInd, double[] values)Copy constructor for COO representation- Parameters:
rows- number of rowsrowInd- row indicescolInd- column indicesvalues- non zero values
-
SparseBlockCSR
public SparseBlockCSR(int rows, int nnz, int[] colInd)Copy constructor for given array of column indexes, which identifies rows by position and implies values of 1.- Parameters:
rows- number of rowsnnz- number of non-zeroscolInd- column indexes
-
-
Method Detail
-
initUltraSparse
public void initUltraSparse(int nnz, DataInput in) throws IOExceptionInitializes the CSR sparse block from an ordered input stream of ultra-sparse ijv triples.- Parameters:
nnz- number of non-zeros to readin- data input stream of ijv triples, ordered by ij- Throws:
IOException- if deserialization error occurs
-
initSparse
public void initSparse(int rlen, int nnz, DataInput in) throws IOExceptionInitializes the CSR sparse block from an ordered input stream of sparse rows (rownnz, jv-pairs*).- Parameters:
rlen- number of rowsnnz- number of non-zeros to readin- data input stream of sparse rows, ordered by i- Throws:
IOException- if deserialization error occurs
-
estimateSizeInMemory
public static long estimateSizeInMemory(long nrows, long ncols, double sparsity)Get the estimated in-memory size of the sparse block in CSR with the given dimensions w/o accounting for overallocation.- Parameters:
nrows- number of rowsncols- number of columnssparsity- sparsity ratio- Returns:
- memory estimate
-
rowPointers
public int[] rowPointers()
Get raw access to underlying array of row pointers For use in GPU code- Returns:
- array of row pointers
-
indexes
public int[] indexes()
Get raw access to underlying array of column indices For use in GPU code- Returns:
- array of column indexes
-
values
public double[] values()
Get raw access to underlying array of values For use in GPU code- Returns:
- array of values
-
allocate
public void allocate(int r)
Description copied from class:SparseBlockAllocate the underlying data structure holding non-zero values of row r if necessary.- Specified by:
allocatein classSparseBlock- Parameters:
r- row index
-
allocate
public void allocate(int r, int nnz)Description copied from class:SparseBlockAllocate the underlying data structure holding non-zero values of row r if necessary, w/ given size.- Specified by:
allocatein classSparseBlock- Parameters:
r- row indexnnz- number of non-zeros
-
allocate
public void allocate(int r, int ennz, int maxnnz)Description copied from class:SparseBlockAllocate the underlying data structure holding non-zero values of row r w/ the specified estimated nnz and max nnz.- Specified by:
allocatein classSparseBlock- Parameters:
r- row indexennz- estimated non-zerosmaxnnz- max non-zeros
-
compact
public void compact(int r)
Description copied from class:SparseBlockRe-allocate physical row if physical size exceeds logical size plus resize factor.- Specified by:
compactin classSparseBlock- Parameters:
r- row index
-
compact
public void compact()
-
numRows
public int numRows()
Description copied from class:SparseBlockGet the number of rows in the sparse block.- Specified by:
numRowsin classSparseBlock- Returns:
- number of rows
-
isThreadSafe
public boolean isThreadSafe()
Description copied from class:SparseBlockIndicates if the underlying implementation allows thread-safe row updates if concurrent threads update disjoint rows.- Specified by:
isThreadSafein classSparseBlock- Returns:
- true if thread-safe row updates
-
isContiguous
public boolean isContiguous()
Description copied from class:SparseBlockIndicates if the underlying data structures returned by values and indexes are contiguous arrays, which can be exploited for more efficient operations.- Specified by:
isContiguousin classSparseBlock- Returns:
- true if underlying data structures are contiguous arrays
-
isAllocated
public boolean isAllocated(int r)
Description copied from class:SparseBlockIndicates if the underlying data structure for a given row is already allocated.- Specified by:
isAllocatedin classSparseBlock- Parameters:
r- row index- Returns:
- true if already allocated
-
reset
public void reset()
Description copied from class:SparseBlockClears the sparse block by deleting non-zero values. After this call all size() calls are guaranteed to return 0.- Specified by:
resetin classSparseBlock
-
reset
public void reset(int ennz, int maxnnz)Description copied from class:SparseBlockClears the sparse block by deleting non-zero values. After this call all size() calls are guaranteed to return 0.- Specified by:
resetin classSparseBlock- Parameters:
ennz- estimated non-zerosmaxnnz- max non-zeros
-
reset
public void reset(int r, int ennz, int maxnnz)Description copied from class:SparseBlockClears row r of the sparse block by deleting non-zero values. After this call size(r) is guaranteed to return 0.- Specified by:
resetin classSparseBlock- Parameters:
r- row indexennz- estimated non-zerosmaxnnz- max non-zeros
-
size
public long size()
Description copied from class:SparseBlockGet the number of non-zero values in the sparse block.- Specified by:
sizein classSparseBlock- Returns:
- number of non-zero values in sparse block
-
size
public int size(int r)
Description copied from class:SparseBlockGet the number of non-zero values in row r.- Specified by:
sizein classSparseBlock- Parameters:
r- row index starting at 0- Returns:
- number of non-zero values in row r
-
size
public long size(int rl, int ru)Description copied from class:SparseBlockGet the number of non-zeros values in the row range of [rl, ru).- Specified by:
sizein classSparseBlock- Parameters:
rl- row lower indexru- row upper index- Returns:
- number of non-zero values in the row range
-
size
public long size(int rl, int ru, int cl, int cu)Description copied from class:SparseBlockGet the number of non-zeros values in the row and column range of [rl/cl, ru/cu);- Specified by:
sizein classSparseBlock- Parameters:
rl- row lower indexru- row upper indexcl- column lower indexcu- column upper index- Returns:
- number of non-zero values in the row and column range
-
isEmpty
public boolean isEmpty(int r)
Description copied from class:SparseBlockGet information if row r is empty, i.e., does not contain non-zero values. Equivalent to size(r)==0. Users should do this check if it is unknown if the underlying row data structure is allocated.- Specified by:
isEmptyin classSparseBlock- Parameters:
r- row index starting at 0- Returns:
- true if row does not contain non-zero values
-
indexes
public int[] indexes(int r)
Description copied from class:SparseBlockGet the sorted array of column indexes of all non-zero entries in row r. Note that - for flexibility of the implementing format - the returned array may be larger, where the range for row r is given by [pos(r),pos(r)+size(r)).- Specified by:
indexesin classSparseBlock- Parameters:
r- row index starting at 0- Returns:
- sorted array of column indexes
-
values
public double[] values(int r)
Description copied from class:SparseBlockGet the array of all non-zero entries in row r, sorted by their column indexes. Note that - for flexibility of the implementing format - the returned array may be larger, where the range for row r is given by [pos(r),pos(r)+size(r)).- Specified by:
valuesin classSparseBlock- Parameters:
r- row index starting at 0- Returns:
- array of all non-zero entries in row r sorted by column indexes
-
pos
public int pos(int r)
Description copied from class:SparseBlockGet the starting position of row r in the indexes/values arrays returned by indexes(r) and values(r).- Specified by:
posin classSparseBlock- Parameters:
r- row index starting at 0- Returns:
- starting position of row r
-
set
public boolean set(int r, int c, double v)Description copied from class:SparseBlockSet the value of a matrix cell (r,c). This might update an existing non-zero value, insert a new non-zero value, or delete a non-zero value.- Specified by:
setin classSparseBlock- Parameters:
r- row index starting at 0c- column index starting at 0v- zero or non-zero value- Returns:
- true, if number of non-zeros changed
-
add
public boolean add(int r, int c, double v)Description copied from class:SparseBlockAdd a value to a matrix cell (r,c). This might update an existing non-zero value, or insert a new non-zero value.- Specified by:
addin classSparseBlock- Parameters:
r- row index starting at 0c- column index starting at 0v- zero or non-zero value- Returns:
- true, if number of non-zeros changed
-
set
public void set(int r, SparseRow row, boolean deep)Description copied from class:SparseBlockSet the values of row r to the given sparse row. This might update existing non-zero values, insert a new row, or delete a row. NOTE: This method exists for incremental runtime integration and might be deleted in the future.- Specified by:
setin classSparseBlock- Parameters:
r- row index starting at 0row- sparse rowdeep- indicator to create deep copy of sparse row
-
append
public void append(int r, int c, double v)Description copied from class:SparseBlockAppend a value to the end of the physical representation. This should only be used for operations with sequential write pattern or if followed by a sort() operation. Note that this operation does not perform any matrix cell updates.- Specified by:
appendin classSparseBlock- Parameters:
r- row index starting at 0c- column index starting at 0v- zero or non-zero value
-
setIndexRange
public void setIndexRange(int r, int cl, int cu, double[] v, int vix, int vlen)Description copied from class:SparseBlockSets a dense array of non-zeros values into the column range [cl,cu) in row r. The passed value array may be larger and the relevant range is given by [vix,vix+len).- Specified by:
setIndexRangein classSparseBlock- Parameters:
r- row index starting at 0cl- lower column index starting at 0cu- upper column index starting at 0v- value arrayvix- start index in value arrayvlen- number of relevant values
-
setIndexRange
public void setIndexRange(int r, int cl, int cu, double[] v, int[] vix, int vpos, int vlen)Description copied from class:SparseBlockSets a sparse array of non-zeros values and indexes into the column range [cl,cu) in row r. The passed value array may be larger.- Specified by:
setIndexRangein classSparseBlock- Parameters:
r- row index starting at 0cl- lower column index starting at 0cu- upper column index starting at 0v- value arrayvix- column index arrayvpos- start index in value and index arraysvlen- number of relevant values
-
setIndexRange
public void setIndexRange(int rl, int ru, int cl, int cu, double[] v, int vix, int vlen)Inserts a sorted row-major array of non-zero values into the row and column range [rl,ru) and [cl,cu). Note: that this is a CSR-specific method to address performance issues due to repeated re-shifting on update-in-place.- Parameters:
rl- lower row index, starting at 0, inclusiveru- upper row index, starting at 0, exclusivecl- lower column index, starting at 0, inclusivecu- upper column index, starting at 0, exclusivev- right-hand-side dense blockvix- right-hand-side dense block indexvlen- right-hand-side dense block value length
-
setIndexRange
public void setIndexRange(int rl, int ru, int cl, int cu, SparseBlock sb)Inserts a sparse block into the row and column range [rl,ru) and [cl,cu). Note: that this is a CSR-specific method to address performance issues due to repeated re-shifting on update-in-place.- Parameters:
rl- lower row index, starting at 0, inclusiveru- upper row index, starting at 0, exclusivecl- lower column index, starting at 0, inclusivecu- upper column index, starting at 0, exclusivesb- right-hand-side sparse block
-
deleteIndexRange
public void deleteIndexRange(int r, int cl, int cu)Description copied from class:SparseBlockDeletes all non-zero values of the given column range [cl,cu) in row r.- Specified by:
deleteIndexRangein classSparseBlock- Parameters:
r- row index starting at 0cl- lower column index starting at 0cu- upper column index starting at 0
-
sort
public void sort()
Description copied from class:SparseBlockSort all non-zero value/index pairs of the sparse block by row and column index.- Specified by:
sortin classSparseBlock
-
sort
public void sort(int r)
Description copied from class:SparseBlockSort all non-zero value/index pairs of row r column index.- Specified by:
sortin classSparseBlock- Parameters:
r- row index starting at 0
-
get
public double get(int r, int c)Description copied from class:SparseBlockGet value of matrix cell (r,c). In case of non existing values this call returns 0.- Specified by:
getin classSparseBlock- Parameters:
r- row index starting at 0c- column index starting at 0- Returns:
- value of cell at position (r,c)
-
get
public SparseRow get(int r)
Description copied from class:SparseBlockGet values of row r in the format of a sparse row. NOTE: This method exists for incremental runtime integration and might be deleted in the future.- Specified by:
getin classSparseBlock- Parameters:
r- row index starting at 0- Returns:
- values of row r as a sparse row
-
posFIndexLTE
public int posFIndexLTE(int r, int c)Description copied from class:SparseBlockGet position of first column index lower than or equal column c in row r. The position is relative to the indexes/values arrays returned by indexes(r) and values(r). If no such value exists, this call returns -1.- Specified by:
posFIndexLTEin classSparseBlock- Parameters:
r- row index starting at 0c- column index starting at 0- Returns:
- position of the first column index lower than or equal to column c in row r
-
posFIndexGTE
public int posFIndexGTE(int r, int c)Description copied from class:SparseBlockGet position of first column index greater than or equal column c in row r. The position is relative to the indexes/values arrays returned by indexes(r) and values(r). If no such value exists, this call returns -1. Note if CSR the pos(r) is subtracted from the result.- Specified by:
posFIndexGTEin classSparseBlock- Parameters:
r- row index starting at 0c- column index starting at 0- Returns:
- position of the first column index greater than or equal to column c in row r
-
posFIndexGT
public int posFIndexGT(int r, int c)Description copied from class:SparseBlockGet position of first column index greater than column c in row r. The position is relative to the indexes/values arrays returned by indexes(r) and values(r). If no such value exists, this call returns -1.- Specified by:
posFIndexGTin classSparseBlock- Parameters:
r- row index starting at 0c- column index starting at 0- Returns:
- position of the first column index greater than column c in row r
-
toString
public String toString()
- Specified by:
toStringin classSparseBlock
-
checkValidity
public boolean checkValidity(int rlen, int clen, long nnz, boolean strict)Description copied from class:SparseBlockValidate the correctness of the internal data structures of the different sparse block implementations.- Specified by:
checkValidityin classSparseBlock- Parameters:
rlen- number of rowsclen- number of columnsnnz- number of non zerosstrict- enforce optional properties- Returns:
- true if the sparse block is valid wrt the corresponding format such as COO, CSR, MCSR.
-
-