/*
 * Hunt - A refined core library for D programming language.
 *
 * Copyright (C) 2018-2019 HuntLabs
 *
 * Website: https://www.huntlabs.net/
 *
 * Licensed under the Apache-2.0 License.
 *
 */

module hunt.io.DataInput;

/**
 * The {@code DataInput} interface provides
 * for reading bytes from a binary stream and
 * reconstructing from them data in any of
 * the Java primitive types. There is also a
 * facility for reconstructing a {@code string}
 * from data in
 * <a href="#modified-utf-8">modified UTF-8</a>
 * format.
 * <p>
 * It is generally true of all the reading
 * routines in this interface that if end of
 * file is reached before the desired number
 * of bytes has been read, an {@code EOFException}
 * (which is a kind of {@code IOException})
 * is thrown. If any byte cannot be read for
 * any reason other than end of file, an {@code IOException}
 * other than {@code EOFException} is
 * thrown. In particular, an {@code IOException}
 * may be thrown if the input stream has been
 * closed.
 *
 * <h3><a id="modified-utf-8">Modified UTF-8</a></h3>
 * <p>
 * Implementations of the DataInput and DataOutput interfaces represent
 * Unicode strings in a format that is a slight modification of UTF-8.
 * (For information regarding the standard UTF-8 format, see section
 * <i>3.9 Unicode Encoding Forms</i> of <i>The Unicode Standard, Version
 * 4.0</i>)
 *
 * <ul>
 * <li>Characters in the range {@code '\u0001'} to
 *     {@code '\u007F'} are represented by a single byte.
 * <li>The null character {@code '\u0000'} and characters
 *     in the range {@code '\u0080'} to {@code '\u07FF'} are
 *     represented by a pair of bytes.
 * <li>Characters in the range {@code '\u0800'}
 *     to {@code '\uFFFF'} are represented by three bytes.
 * </ul>
 *
 * <table class="plain" style="margin-left:2em;">
 *   <caption>Encoding of UTF-8 values</caption>
 *   <thead>
 *   <tr>
 *     <th scope="col" rowspan="2">Value</th>
 *     <th scope="col" rowspan="2">Byte</th>
 *     <th scope="col" colspan="8" id="bit_a">Bit Values</th>
 *   </tr>
 *   <tr>
 *     <!-- Value -->
 *     <!-- Byte -->
 *     <th scope="col" style="width:3em"> 7 </th>
 *     <th scope="col" style="width:3em"> 6 </th>
 *     <th scope="col" style="width:3em"> 5 </th>
 *     <th scope="col" style="width:3em"> 4 </th>
 *     <th scope="col" style="width:3em"> 3 </th>
 *     <th scope="col" style="width:3em"> 2 </th>
 *     <th scope="col" style="width:3em"> 1 </th>
 *     <th scope="col" style="width:3em"> 0 </th>
 *   </thead>
 *   <tbody>
 *   <tr>
 *     <th scope="row" style="text-align:left; font-weight:normal">
 *       {@code \u0001} to {@code \u007F} </th>
 *     <th scope="row" style="font-weight:normal; text-align:center"> 1 </th>
 *     <td style="text-align:center">0
 *     <td colspan="7" style="text-align:right; padding-right:6em">bits 6-0
 *   </tr>
 *   <tr>
 *     <th scope="row" rowspan="2" style="text-align:left; font-weight:normal">
 *       {@code \u0000},<br>
 *       {@code \u0080} to {@code \u07FF} </th>
 *     <th scope="row" style="font-weight:normal; text-align:center"> 1 </th>
 *     <td style="text-align:center">1
 *     <td style="text-align:center">1
 *     <td style="text-align:center">0
 *     <td colspan="5" style="text-align:right; padding-right:6em">bits 10-6
 *   </tr>
 *   <tr>
 *     <!-- (value) -->
 *     <th scope="row" style="font-weight:normal; text-align:center"> 2 </th>
 *     <td style="text-align:center">1
 *     <td style="text-align:center">0
 *     <td colspan="6" style="text-align:right; padding-right:6em">bits 5-0
 *   </tr>
 *   <tr>
 *     <th scope="row" rowspan="3" style="text-align:left; font-weight:normal">
 *       {@code \u0800} to {@code \uFFFF} </th>
 *     <th scope="row" style="font-weight:normal; text-align:center"> 1 </th>
 *     <td style="text-align:center">1
 *     <td style="text-align:center">1
 *     <td style="text-align:center">1
 *     <td style="text-align:center">0
 *     <td colspan="4" style="text-align:right; padding-right:6em">bits 15-12
 *   </tr>
 *   <tr>
 *     <!-- (value) -->
 *     <th scope="row" style="font-weight:normal; text-align:center"> 2 </th>
 *     <td style="text-align:center">1
 *     <td style="text-align:center">0
 *     <td colspan="6" style="text-align:right; padding-right:6em">bits 11-6
 *   </tr>
 *   <tr>
 *     <!-- (value) -->
 *     <th scope="row" style="font-weight:normal; text-align:center"> 3 </th>
 *     <td style="text-align:center">1
 *     <td style="text-align:center">0
 *     <td colspan="6" style="text-align:right; padding-right:6em">bits 5-0
 *   </tr>
 *   </tbody>
 * </table>
 *
 * <p>
 * The differences between this format and the
 * standard UTF-8 format are the following:
 * <ul>
 * <li>The null byte {@code '\u0000'} is encoded in 2-byte format
 *     rather than 1-byte, so that the encoded strings never have
 *     embedded nulls.
 * <li>Only the 1-byte, 2-byte, and 3-byte formats are used.
 * <li><a href="../lang/Character.html#unicode">Supplementary characters</a>
 *     are represented in the form of surrogate pairs.
 * </ul>
 * @author Frank Yellin
 * @see java.io.DataInputStream
 * @see java.io.DataOutput
 * @since 1.0
 */
public interface DataInput {
    /**
     * Reads some bytes from an input
     * stream and stores them into the buffer
     * array {@code b}. The number of bytes
     * read is equal
     * to the length of {@code b}.
     * <p>
     * This method blocks until one of the
     * following conditions occurs:
     * <ul>
     * <li>{@code b.length}
     * bytes of input data are available, in which
     * case a normal return is made.
     *
     * <li>End of
     * file is detected, in which case an {@code EOFException}
     * is thrown.
     *
     * <li>An I/O error occurs, in
     * which case an {@code IOException} other
     * than {@code EOFException} is thrown.
     * </ul>
     * <p>
     * If {@code b} is {@code null},
     * a {@code NullPointerException} is thrown.
     * If {@code b.length} is zero, then
     * no bytes are read. Otherwise, the first
     * byte read is stored into element {@code b[0]},
     * the next one into {@code b[1]}, and
     * so on.
     * If an exception is thrown from
     * this method, then it may be that some but
     * not all bytes of {@code b} have been
     * updated with data from the input stream.
     *
     * @param     b   the buffer into which the data is read.
     * @throws    NullPointerException if {@code b} is {@code null}.
     * @throws    EOFException  if this stream reaches the end before reading
     *              all the bytes.
     * @throws    IOException   if an I/O error occurs.
     */
    void readFully(byte[] b);

    /**
     *
     * Reads {@code len}
     * bytes from
     * an input stream.
     * <p>
     * This method
     * blocks until one of the following conditions
     * occurs:
     * <ul>
     * <li>{@code len} bytes
     * of input data are available, in which case
     * a normal return is made.
     *
     * <li>End of file
     * is detected, in which case an {@code EOFException}
     * is thrown.
     *
     * <li>An I/O error occurs, in
     * which case an {@code IOException} other
     * than {@code EOFException} is thrown.
     * </ul>
     * <p>
     * If {@code b} is {@code null},
     * a {@code NullPointerException} is thrown.
     * If {@code off} is negative, or {@code len}
     * is negative, or {@code off+len} is
     * greater than the length of the array {@code b},
     * then an {@code IndexOutOfBoundsException}
     * is thrown.
     * If {@code len} is zero,
     * then no bytes are read. Otherwise, the first
     * byte read is stored into element {@code b[off]},
     * the next one into {@code b[off+1]},
     * and so on. The number of bytes read is,
     * at most, equal to {@code len}.
     *
     * @param     b    the buffer into which the data is read.
     * @param     off  an int specifying the offset in the data array {@code b}.
     * @param     len  an int specifying the number of bytes to read.
     * @throws    NullPointerException if {@code b} is {@code null}.
     * @throws    IndexOutOfBoundsException if {@code off} is negative,
     *              {@code len} is negative, or {@code len} is greater than
     *              {@code b.length - off}.
     * @throws    EOFException  if this stream reaches the end before reading
     *              all the bytes.
     * @throws    IOException   if an I/O error occurs.
     */
    void readFully(byte[] b, int off, int len);

    /**
     * Makes an attempt to skip over
     * {@code n} bytes
     * of data from the input
     * stream, discarding the skipped bytes. However,
     * it may skip
     * over some smaller number of
     * bytes, possibly zero. This may result from
     * any of a
     * number of conditions; reaching
     * end of file before {@code n} bytes
     * have been skipped is
     * only one possibility.
     * This method never throws an {@code EOFException}.
     * The actual
     * number of bytes skipped is returned.
     *
     * @param      n   the number of bytes to be skipped.
     * @return     the number of bytes actually skipped.
     * @exception  IOException   if an I/O error occurs.
     */
    int skipBytes(int n);

    /**
     * Reads one input byte and returns
     * {@code true} if that byte is nonzero,
     * {@code false} if that byte is zero.
     * This method is suitable for reading
     * the byte written by the {@code writeBoolean}
     * method of interface {@code DataOutput}.
     *
     * @return     the {@code bool} value read.
     * @exception  EOFException  if this stream reaches the end before reading
     *               all the bytes.
     * @exception  IOException   if an I/O error occurs.
     */
    bool readBoolean();

    /**
     * Reads and returns one input byte.
     * The byte is treated as a signed value in
     * the range {@code -128} through {@code 127},
     * inclusive.
     * This method is suitable for
     * reading the byte written by the {@code writeByte}
     * method of interface {@code DataOutput}.
     *
     * @return     the 8-bit value read.
     * @exception  EOFException  if this stream reaches the end before reading
     *               all the bytes.
     * @exception  IOException   if an I/O error occurs.
     */
    byte readByte();

    /**
     * Reads one input byte, zero-extends
     * it to type {@code int}, and returns
     * the result, which is therefore in the range
     * {@code 0}
     * through {@code 255}.
     * This method is suitable for reading
     * the byte written by the {@code writeByte}
     * method of interface {@code DataOutput}
     * if the argument to {@code writeByte}
     * was intended to be a value in the range
     * {@code 0} through {@code 255}.
     *
     * @return     the unsigned 8-bit value read.
     * @exception  EOFException  if this stream reaches the end before reading
     *               all the bytes.
     * @exception  IOException   if an I/O error occurs.
     */
    int readUnsignedByte();

    /**
     * Reads two input bytes and returns
     * a {@code short} value. Let {@code a}
     * be the first byte read and {@code b}
     * be the second byte. The value
     * returned
     * is:
     * <pre>{@code (short)((a << 8) | (b & 0xff))
     * }</pre>
     * This method
     * is suitable for reading the bytes written
     * by the {@code writeShort} method of
     * interface {@code DataOutput}.
     *
     * @return     the 16-bit value read.
     * @exception  EOFException  if this stream reaches the end before reading
     *               all the bytes.
     * @exception  IOException   if an I/O error occurs.
     */
    short readShort();

    /**
     * Reads two input bytes and returns
     * an {@code int} value in the range {@code 0}
     * through {@code 65535}. Let {@code a}
     * be the first byte read and
     * {@code b}
     * be the second byte. The value returned is:
     * <pre>{@code (((a & 0xff) << 8) | (b & 0xff))
     * }</pre>
     * This method is suitable for reading the bytes
     * written by the {@code writeShort} method
     * of interface {@code DataOutput} if
     * the argument to {@code writeShort}
     * was intended to be a value in the range
     * {@code 0} through {@code 65535}.
     *
     * @return     the unsigned 16-bit value read.
     * @exception  EOFException  if this stream reaches the end before reading
     *               all the bytes.
     * @exception  IOException   if an I/O error occurs.
     */
    int readUnsignedShort();

    /**
     * Reads two input bytes and returns a {@code char} value.
     * Let {@code a}
     * be the first byte read and {@code b}
     * be the second byte. The value
     * returned is:
     * <pre>{@code (char)((a << 8) | (b & 0xff))
     * }</pre>
     * This method
     * is suitable for reading bytes written by
     * the {@code writeChar} method of interface
     * {@code DataOutput}.
     *
     * @return     the {@code char} value read.
     * @exception  EOFException  if this stream reaches the end before reading
     *               all the bytes.
     * @exception  IOException   if an I/O error occurs.
     */
    char readChar();

    /**
     * Reads four input bytes and returns an
     * {@code int} value. Let {@code a-d}
     * be the first through fourth bytes read. The value returned is:
     * <pre>{@code
     * (((a & 0xff) << 24) | ((b & 0xff) << 16) |
     *  ((c & 0xff) <<  8) | (d & 0xff))
     * }</pre>
     * This method is suitable
     * for reading bytes written by the {@code writeInt}
     * method of interface {@code DataOutput}.
     *
     * @return     the {@code int} value read.
     * @exception  EOFException  if this stream reaches the end before reading
     *               all the bytes.
     * @exception  IOException   if an I/O error occurs.
     */
    int readInt();

    /**
     * Reads eight input bytes and returns
     * a {@code long} value. Let {@code a-h}
     * be the first through eighth bytes read.
     * The value returned is:
     * <pre>{@code
     * (((long)(a & 0xff) << 56) |
     *  ((long)(b & 0xff) << 48) |
     *  ((long)(c & 0xff) << 40) |
     *  ((long)(d & 0xff) << 32) |
     *  ((long)(e & 0xff) << 24) |
     *  ((long)(f & 0xff) << 16) |
     *  ((long)(g & 0xff) <<  8) |
     *  ((long)(h & 0xff)))
     * }</pre>
     * <p>
     * This method is suitable
     * for reading bytes written by the {@code writeLong}
     * method of interface {@code DataOutput}.
     *
     * @return     the {@code long} value read.
     * @exception  EOFException  if this stream reaches the end before reading
     *               all the bytes.
     * @exception  IOException   if an I/O error occurs.
     */
    long readLong();

    /**
     * Reads four input bytes and returns
     * a {@code float} value. It does this
     * by first constructing an {@code int}
     * value in exactly the manner
     * of the {@code readInt}
     * method, then converting this {@code int}
     * value to a {@code float} in
     * exactly the manner of the method {@code Float.intBitsToFloat}.
     * This method is suitable for reading
     * bytes written by the {@code writeFloat}
     * method of interface {@code DataOutput}.
     *
     * @return     the {@code float} value read.
     * @exception  EOFException  if this stream reaches the end before reading
     *               all the bytes.
     * @exception  IOException   if an I/O error occurs.
     */
    float readFloat();

    /**
     * Reads eight input bytes and returns
     * a {@code double} value. It does this
     * by first constructing a {@code long}
     * value in exactly the manner
     * of the {@code readLong}
     * method, then converting this {@code long}
     * value to a {@code double} in exactly
     * the manner of the method {@code Double.longBitsToDouble}.
     * This method is suitable for reading
     * bytes written by the {@code writeDouble}
     * method of interface {@code DataOutput}.
     *
     * @return     the {@code double} value read.
     * @exception  EOFException  if this stream reaches the end before reading
     *               all the bytes.
     * @exception  IOException   if an I/O error occurs.
     */
    double readDouble();

    /**
     * Reads the next line of text from the input stream.
     * It reads successive bytes, converting
     * each byte separately into a character,
     * until it encounters a line terminator or
     * end of
     * file; the characters read are then
     * returned as a {@code string}. Note
     * that because this
     * method processes bytes,
     * it does not support input of the full Unicode
     * character set.
     * <p>
     * If end of file is encountered
     * before even one byte can be read, then {@code null}
     * is returned. Otherwise, each byte that is
     * read is converted to type {@code char}
     * by zero-extension. If the character {@code '\n'}
     * is encountered, it is discarded and reading
     * ceases. If the character {@code '\r'}
     * is encountered, it is discarded and, if
     * the following byte converts to the
     * character {@code '\n'}, then that is
     * discarded also; reading then ceases. If
     * end of file is encountered before either
     * of the characters {@code '\n'} and
     * {@code '\r'} is encountered, reading
     * ceases. Once reading has ceased, a {@code string}
     * is returned that contains all the characters
     * read and not discarded, taken in order.
     * Note that every character in this string
     * will have a value less than {@code \u0100},
     * that is, {@code (char)256}.
     *
     * @return the next line of text from the input stream,
     *         or {@code null} if the end of file is
     *         encountered before a byte can be read.
     * @exception  IOException  if an I/O error occurs.
     */
    string readLine();

    /**
     * Reads in a string that has been encoded using a
     * <a href="#modified-utf-8">modified UTF-8</a>
     * format.
     * The general contract of {@code readUTF}
     * is that it reads a representation of a Unicode
     * character string encoded in modified
     * UTF-8 format; this string of characters
     * is then returned as a {@code string}.
     * <p>
     * First, two bytes are read and used to
     * construct an unsigned 16-bit integer in
     * exactly the manner of the {@code readUnsignedShort}
     * method. This integer value is called the
     * <i>UTF length</i> and specifies the number
     * of additional bytes to be read. These bytes
     * are then converted to characters by considering
     * them in groups. The length of each group
     * is computed from the value of the first
     * byte of the group. The byte following a
     * group, if any, is the first byte of the
     * next group.
     * <p>
     * If the first byte of a group
     * matches the bit pattern {@code 0xxxxxxx}
     * (where {@code x} means "may be {@code 0}
     * or {@code 1}"), then the group consists
     * of just that byte. The byte is zero-extended
     * to form a character.
     * <p>
     * If the first byte
     * of a group matches the bit pattern {@code 110xxxxx},
     * then the group consists of that byte {@code a}
     * and a second byte {@code b}. If there
     * is no byte {@code b} (because byte
     * {@code a} was the last of the bytes
     * to be read), or if byte {@code b} does
     * not match the bit pattern {@code 10xxxxxx},
     * then a {@code UTFDataFormatException}
     * is thrown. Otherwise, the group is converted
     * to the character:
     * <pre>{@code (char)(((a & 0x1F) << 6) | (b & 0x3F))
     * }</pre>
     * If the first byte of a group
     * matches the bit pattern {@code 1110xxxx},
     * then the group consists of that byte {@code a}
     * and two more bytes {@code b} and {@code c}.
     * If there is no byte {@code c} (because
     * byte {@code a} was one of the last
     * two of the bytes to be read), or either
     * byte {@code b} or byte {@code c}
     * does not match the bit pattern {@code 10xxxxxx},
     * then a {@code UTFDataFormatException}
     * is thrown. Otherwise, the group is converted
     * to the character:
     * <pre>{@code
     * (char)(((a & 0x0F) << 12) | ((b & 0x3F) << 6) | (c & 0x3F))
     * }</pre>
     * If the first byte of a group matches the
     * pattern {@code 1111xxxx} or the pattern
     * {@code 10xxxxxx}, then a {@code UTFDataFormatException}
     * is thrown.
     * <p>
     * If end of file is encountered
     * at any time during this entire process,
     * then an {@code EOFException} is thrown.
     * <p>
     * After every group has been converted to
     * a character by this process, the characters
     * are gathered, in the same order in which
     * their corresponding groups were read from
     * the input stream, to form a {@code string},
     * which is returned.
     * <p>
     * The {@code writeUTF}
     * method of interface {@code DataOutput}
     * may be used to write data that is suitable
     * for reading by this method.
     * @return     a Unicode string.
     * @exception  EOFException            if this stream reaches the end
     *               before reading all the bytes.
     * @exception  IOException             if an I/O error occurs.
     * @exception  UTFDataFormatException  if the bytes do not represent a
     *               valid modified UTF-8 encoding of a string.
     */
    string readUTF();
}
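
// Illustrative sketch only; it is not part of the Hunt API. It restates, in D,
// the byte-level contracts documented above for readUnsignedShort, readInt and
// readUTF. The helper names (toUnsignedShort, toInt, decodeModifiedUtf8) are
// hypothetical, the decoder is assumed to return UTF-16 code units as a
// wstring, and a plain Exception stands in for UTFDataFormatException. How a
// concrete Hunt implementation maps these values onto D's 8-bit `char` and
// `string` is not specified here.
unittest {
    // Big-endian composition, as documented for readUnsignedShort.
    static int toUnsignedShort(ubyte a, ubyte b) {
        return ((a & 0xff) << 8) | (b & 0xff);
    }

    // Big-endian composition, as documented for readInt.
    static int toInt(ubyte a, ubyte b, ubyte c, ubyte d) {
        return ((a & 0xff) << 24) | ((b & 0xff) << 16) | ((c & 0xff) << 8) | (d & 0xff);
    }

    // Decodes the bytes that follow the two-byte UTF length, group by group.
    static wstring decodeModifiedUtf8(const(ubyte)[] data) {
        wchar[] result;
        size_t i = 0;
        while (i < data.length) {
            immutable uint a = data[i];
            if ((a & 0x80) == 0) {
                // 0xxxxxxx: one-byte group, zero-extended.
                result ~= cast(wchar) a;
                i += 1;
            } else if ((a & 0xE0) == 0xC0) {
                // 110xxxxx: two-byte group.
                if (i + 1 >= data.length)
                    throw new Exception("truncated two-byte group");
                immutable uint b = data[i + 1];
                if ((b & 0xC0) != 0x80)
                    throw new Exception("malformed continuation byte");
                result ~= cast(wchar)(((a & 0x1F) << 6) | (b & 0x3F));
                i += 2;
            } else if ((a & 0xF0) == 0xE0) {
                // 1110xxxx: three-byte group.
                if (i + 2 >= data.length)
                    throw new Exception("truncated three-byte group");
                immutable uint b = data[i + 1];
                immutable uint c = data[i + 2];
                if ((b & 0xC0) != 0x80 || (c & 0xC0) != 0x80)
                    throw new Exception("malformed continuation byte");
                result ~= cast(wchar)(((a & 0x0F) << 12) | ((b & 0x3F) << 6) | (c & 0x3F));
                i += 3;
            } else {
                // 10xxxxxx or 1111xxxx can never start a group.
                throw new Exception("invalid leading byte");
            }
        }
        return result.idup;
    }

    assert(toUnsignedShort(0xFF, 0xFE) == 65534);
    assert(toInt(0x12, 0x34, 0x56, 0x78) == 0x12345678);
    assert(toInt(0xFF, 0xFF, 0xFF, 0xFE) == -2);

    // 'A', then U+0000 in its two-byte form (C0 80), then U+20AC (E2 82 AC).
    ubyte[] encoded = [0x41, 0xC0, 0x80, 0xE2, 0x82, 0xAC];
    assert(decodeModifiedUtf8(encoded) == "A\u0000\u20AC"w);
}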