-
-
Notifications
You must be signed in to change notification settings - Fork 5.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP / RFC: Define IO reading interface #57982
base: master
Are you sure you want to change the base?
Conversation
base/io.jl
Outdated
GC.@preserve ref unsafe_read(io, Ptr{UInt8}(pointer(ref))::Ptr{UInt8}, nbytes) | ||
end | ||
|
||
function unsafe_read(io::IO, dst::Ptr{UInt8}, nbytes::UInt) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is tricky because there is an existing fallback method that uses only read(io, UInt8)
.
If I understand correctly, fillbuffer
can fallback to return 0
and getbuffer
can fallback to return an empty vector for IO
without an underlying buffer. This would hit the isempty(buf) && iszero(nfilled)
case.
Currently, you have this throw an EOFError
, but you could instead do:
unsafe_store!(dst, read(io, UInt8)::UInt8)
dst += 1
nbytes -= 1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fillbuffer
can fall back to 0, but getbuffer
should only be implemented if the IO is buffered. And having it buffered implies that when the buffer is empty, and fillbuffer
returns 0, the IO is EOF. I'll clarify the documentation.
W.r.t the fallback calling read(io, UInt8)
you're right. I just drew out the call graph for the current generic IO functions, and it's not too complex, actually. All reading functions fall back to read(io, UInt8)
.
I'm thinking of circumventing this by adding a check similar to this:
if readbuffering(typeof(io)) == NotBuffered() || hasmethod(getbuffer, Tuple{typeof(io)})
# use methods relying on new interface
else
# use method relying only on read(io, UInt8)
end
This will work, because, since IOs are buffered by default, IOs that have either opted out of buffering, or implemented getbuffer
must be aware of the new interface. It's an ugly solution, but it'll work.
Another consideration for this function specifically, is whether we can write a fast, generic fallback for unsafe_read(::IO, src::Any, ::UInt)
. To do this, we need to be able to dispatch on whether we can write to src
using a pointer, which we don't currently have any abstractions for. This is a hobby horse of mine, but for a reason; you really run into it again and again. I hope to be able to address this in this PR.
base/io.jl
Outdated
isempty(v) && return 0 | ||
buffer = @something get_nonempty_reading_buffer(io) return 0 | ||
mn = min(length(v), length(buffer)) | ||
copyto!(v, firstindex(v), buffer, 1, mn) | ||
mn |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally, this would somehow fall back to readbytes!
to work well with existing IO types. For example, you could check isnothing(buffer) && !eof(io)
(which should only happen for legacy IO types) and then fallback to readbytes!
in that case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Edit: This is a new function, so I think it would actually be a boon if it failed for old IOs. That will push people to implement this new API for old IO types, without breaking any existing code.
It will also simplify the implementation because we don't have to do hacky workarounds to support IOs which don't adhere to the (new) interface.
This is a work in progress proposal to define the interface of the abstract type
IO
, and provide a robust, low-level API that makes it possible to implement efficient, generic IO operations.Discussion is very welcome, see https://hackmd.io/UgeBwtIkTTipkgZSQnXc8w#Discussion for more details and discussion of design decisions.
Having used this API for some of my own work, I must say I really like it (well, most of it).
This PR will, in time, have several parts:
Base.IOBuffer
as a test case for the new interface, to see if a) the API is pleasant, b) the performance is expected, and most importantly, c) this is non-breaking.Test
and test the generic definitions using those, to make sure the generic fallbacks work for buffered IOs, unbuffered IOs, and IOs that do not implement this new interface at all (i.e. previously existing IOs)TODO
IOBuffer
using these new abstractions as a test case for this PR, and fix what breaksDecisions
See https://hackmd.io/UgeBwtIkTTipkgZSQnXc8w for points of discussion
Timeline
I hope to finish this before the feature freeze of 1.13, but I'd rather have this be done well than done soon.
Things left out of this PR
IOBuffer
, being purely in-memory, does not need to work with cancellation. Therefore, this can be left to a future PR.read(::IO, ::Int)
are one of the main uses of IO. However, this is a different problem with a different design space.Closes #55835
Closes #47771