1. Introduction
This section is informative.
Web applications should be able to manipulate as wide a range of user input as possible, including files that a user may wish to upload to a remote server or manipulate inside a rich web application. This specification defines the basic representations for files, lists of files, errors raised by access to files, and programmatic ways to read files. It also defines an interface that represents "raw data" which can be asynchronously processed on the main thread of conforming user agents. The interfaces and API defined in this specification can be used with other interfaces and APIs exposed to the web platform.
The File interface represents file data typically obtained from the underlying file system, and the Blob interface ("Binary Large Object", a name originally introduced to web APIs in Google Gears) represents immutable raw data. File or Blob reads should happen asynchronously on the main thread, with an optional synchronous API used within threaded web applications.
An asynchronous API for reading files prevents blocking and UI "freezing" on a user agent’s main thread.
This specification defines an asynchronous API based on an event model to read and access a File or Blob’s data. A FileReader object provides asynchronous read methods to access that file’s data through event handler attributes and the firing of events. The use of events and event handlers lets separate code blocks monitor the progress of the read (which is particularly useful for remote drives or mounted drives, where file access performance may vary from local drives) and handle error conditions that may arise while reading a file.
An example will be illustrative.
function startRead() {
  // obtain input element through DOM
  var file = document.getElementById('file').files[0];
  if (file) {
    getAsText(file);
  }
}

function getAsText(readFile) {
  var reader = new FileReader();

  // Read file into memory as UTF-16
  reader.readAsText(readFile, "UTF-16");

  // Handle progress, success, and errors
  reader.onprogress = updateProgress;
  reader.onload = loaded;
  reader.onerror = errorHandler;
}

function updateProgress(evt) {
  if (evt.lengthComputable) {
    // evt.loaded and evt.total are ProgressEvent properties
    var loaded = (evt.loaded / evt.total);
    if (loaded < 1) {
      // Increase the prog bar length
      // style.width = (loaded * 200) + "px";
    }
  }
}

function loaded(evt) {
  // Obtain the read file data
  var fileString = evt.target.result;

  // Handle UTF-16 file dump
  if (utils.regexp.isChinese(fileString)) {
    // Chinese Characters + Name validation
  } else {
    // run other charset test
  }
  // xhr.send(fileString)
}

function errorHandler(evt) {
  if (evt.target.error.name == "NotReadableError") {
    // The file could not be read
  }
}
2. Terminology and Algorithms
When this specification says to terminate an algorithm, the user agent must terminate the algorithm after finishing the step it is on. Asynchronous read methods defined in this specification may return before the algorithm in question is terminated, and can be terminated by an abort() call.
The algorithms and steps in this specification use the following mathematical operations:
- max(a,b) returns the maximum of a and b, and is always performed on integers as they are defined in WebIDL [WebIDL]; in the case of max(6,4) the result is 6. This operation is also defined in ECMAScript [ECMA-262].
- min(a,b) returns the minimum of a and b, and is always performed on integers as they are defined in WebIDL [WebIDL]; in the case of min(6,4) the result is 4. This operation is also defined in ECMAScript [ECMA-262].
- Mathematical comparisons such as < (less than), ≤ (less than or equal to), and > (greater than) are as in ECMAScript [ECMA-262].
The term Unix Epoch is used in this specification to refer to the time 00:00:00 UTC on January 1 1970 (or 1970-01-01T00:00:00Z ISO 8601); this is the same time that is conceptually "0" in ECMA-262 [ECMA-262].
The slice blob algorithm, given a Blob blob, start, end, and contentType, is used to refer to the following steps and returns a new Blob containing the bytes ranging from the start parameter up to but not including the end parameter. It must act as follows:
- Let originalSize be blob’s size.
- The start parameter, if non-null, is a value for the start point of a slice blob call, and must be treated as a byte-order position, with the zeroth position representing the first byte. User agents must normalize start according to the following:
  - If start is null, let relativeStart be 0.
  - If start is negative, let relativeStart be max((originalSize + start), 0).
  - Otherwise, let relativeStart be min(start, originalSize).
- The end parameter, if non-null, is a value for the end point of a slice blob call. User agents must normalize end according to the following:
  - If end is null, let relativeEnd be originalSize.
  - If end is negative, let relativeEnd be max((originalSize + end), 0).
  - Otherwise, let relativeEnd be min(end, originalSize).
- The contentType parameter, if non-null, is used to set the ASCII-encoded string in lower case representing the media type of the Blob. User agents must normalize contentType according to the following:
  - If contentType is null, let relativeContentType be set to the empty string.
  - Otherwise, let relativeContentType be set to contentType and run the substeps below:
    - If relativeContentType contains any characters outside the range of U+0020 to U+007E, then set relativeContentType to the empty string and return from these substeps.
    - Convert every character in relativeContentType to ASCII lowercase.
- Let span be max((relativeEnd - relativeStart), 0).
- Return a new Blob object S with the following characteristics:
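The normalization steps of the slice blob algorithm can be sketched in plain JavaScript. This is an illustrative sketch only, not normative text: the function name normalizeSlice and the shape of its return value are hypothetical, and it assumes start and end have already been converted to integers (or are null).

```javascript
// Hypothetical helper illustrating the normalization steps of the
// slice blob algorithm. Not part of any API defined by this specification.
function normalizeSlice(originalSize, start = null, end = null, contentType = null) {
  // Normalize start: null means 0, negative counts back from the end.
  let relativeStart;
  if (start === null) {
    relativeStart = 0;
  } else if (start < 0) {
    relativeStart = Math.max(originalSize + start, 0);
  } else {
    relativeStart = Math.min(start, originalSize);
  }

  // Normalize end: null means originalSize, negative counts back from the end.
  let relativeEnd;
  if (end === null) {
    relativeEnd = originalSize;
  } else if (end < 0) {
    relativeEnd = Math.max(originalSize + end, 0);
  } else {
    relativeEnd = Math.min(end, originalSize);
  }

  // Normalize contentType: empty string if null or if any character
  // falls outside U+0020..U+007E; otherwise ASCII lowercase.
  let relativeContentType = "";
  if (contentType !== null && /^[\u0020-\u007E]*$/.test(contentType)) {
    // Only A-Z can change here, because every character is printable ASCII.
    relativeContentType = contentType.toLowerCase();
  }

  // A negative difference (end before start) yields an empty span.
  const span = Math.max(relativeEnd - relativeStart, 0);

  return { relativeStart, relativeEnd, relativeContentType, span };
}
```

For example, with a 10-byte blob, a start of -3 and a null end give relativeStart 7, relativeEnd 10, and a span of 3 bytes, while a start of 8 and an end of 4 give a span of 0.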
3. The Blob Interface and Binary Data
A Blob object refers to a byte sequence, and has a size attribute which is the total number of bytes in the byte sequence, and a type attribute, which is an ASCII-encoded string in lower case representing the media type of the byte sequence.
Each Blob must have an internal snapshot state, which must be initially set to the state of the underlying storage, if any such underlying storage exists. Further normative definition of snapshot state can be found for Files.
[Exposed=(Window,Worker), Serializable]
interface Blob {
  constructor(optional sequence<BlobPart> blobParts,
              optional BlobPropertyBag options = {});

  readonly attribute unsigned long long size;
  readonly attribute DOMString type;

  // slice Blob into byte-ranged chunks
  Blob slice(optional [Clamp] long long start,
             optional [Clamp] long long end,
             optional DOMString contentType);

  // read from the Blob.
  [NewObject] ReadableStream stream();
  [NewObject] Promise<USVString> text();
  [NewObject] Promise<ArrayBuffer> arrayBuffer();
  [NewObject] Promise<Uint8Array> bytes();
};

enum EndingType { "transparent", "native" };

dictionary BlobPropertyBag {
  DOMString type = "";
  EndingType endings = "transparent";
};

typedef (BufferSource or Blob or USVString) BlobPart;
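As an informal illustration of the interface above, the following sketch constructs a Blob from two string parts, inspects its size and type attributes, and slices out a byte range. It assumes an environment that exposes Blob as a global (a browser, or a recent Node.js release); the variable names and the readTail helper are hypothetical.

```javascript
// Build a Blob from two string parts; strings are encoded as UTF-8.
const blob = new Blob(["hello, ", "world"], { type: "text/plain" });

// size counts bytes in the byte sequence; type is the media type string.
const totalBytes = blob.size; // 7 + 5 bytes of ASCII text
const mediaType = blob.type;

// slice() returns a new Blob covering bytes 7 up to, but not including, 12.
const tail = blob.slice(7, 12, "text/plain");

// Hypothetical helper: text() returns a Promise, so reading is asynchronous.
async function readTail() {
  return tail.text();
}
```

Calling readTail() returns a promise for the sliced text; the original blob is not modified by slice(), since a Blob represents immutable raw data.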
Blob objects are serializable objects. Their serialization steps, given value and serialized, are:
- Set serialized.[[SnapshotState]] to value’s snapshot state.
- Set serialized.[[ByteSequence]] to value’s underlying byte sequence.
Their deserialization steps, given serialized and value, are:
- Set value’s snapshot state to serialized.[[SnapshotState]].
-
Set