CSC 142/Chapter 6: Difference between revisions
From charlesreid1
Revision as of 10:39, 1 September 2016
Chapter 6: File Processing
Sections:
6.1 File reading basics
6.2 Token based processing
6.3 Line based processing
6.4 Advanced file processing
6.5 Case study: zip code lookup
Chapter 3 focused on a scanner for user input. Chapter 6 focuses on a scanner for file reading.
Many intro programming classes see this as a complicated topic, and Java doesn't make it easy. It's awkward, but it's manageable.
We will also explore exceptions related to file processing.
(Python makes this a dream.)
with open('data.txt','r') as f:
    lines = f.readlines()
Done.
Section 6.1: File Reading Basics
Definitions
Definitions:
- File
- File extension
- binary
- ASCII
- Checked exception
- Throws clause
Material
Examples of the deluge of data available:
- Landmark-project (earthquakes, pollution, baseball, history, weather, etc.)
- Gutenberg - see ciphertexts
- ncbi.nlm.nih.gov - biological/genomic data
- IMDB
- Fedstats.gov
- US census
- World bank
- CIA world factbook
Files and file objects:
- Data stored on computer as files
- Files have extensions
- Files can be stored as text, or as binary
- To deal with a file, use a File object
- This provides various methods
- Java API lookup/reference
- Note: we aren't constructing a NEW FILE, we're constructing a new object to represent an existing file
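A minimal sketch of the last point, assuming a file name of hamlet.txt (borrowed from later in these notes) and a made-up class name FileInfoSketch. It only uses File methods from the standard Java API (exists, getName, length, canRead); whether the file is actually present depends on where you run it.

```java
import java.io.File;

class FileInfoSketch {
    // Builds a one-line description of whatever path the File object points at.
    static String describe(File f) {
        if (!f.exists()) {
            return f.getName() + " does not exist";
        }
        return f.getName() + " exists, " + f.length() + " bytes, readable: " + f.canRead();
    }

    public static void main(String[] args) {
        // Constructing a File object does NOT create a file on disk;
        // it only represents a path that may or may not exist.
        File f = new File("hamlet.txt");
        System.out.println(describe(f));
    }
}
```

If hamlet.txt is missing, the program still runs: the File object is just a handle, and nothing fails until something tries to read from it.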
Reading files with scanner:
- Useful methods of File objects: (see list)
- File object is like a pipe: doesn't care much about what kind of fluid flowing thru, or where it comes from
- File object is the delivery system
- You can then pass the File object into a Scanner
- Again, the Scanner is like a nozzle at the end of the pipe: it does not care much about the file type or the details of the File object, just as a nozzle doesn't care about the type of fluid
- Need to deal with potential problems, e.g., the file may not be there
- Checked exception - like "check" in chess
- Must be dealt with (can't just ignore it and keep going)
- To deal with this exception, either catch it or declare it with a throws clause on the header of the method containing the code that may cause the error
Throws clause: diapers for your code
More on throws/catch clauses:
- You're anticipating a particular kind of mess
- Like an if statement, for exceptions
- If we see this kind of exception, catch it this particular way
// needs: import java.io.FileNotFoundException;
public static void main(String[] args) throws FileNotFoundException {
    ...
}
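The throws clause above passes the exception up to whoever called the method. A rough sketch of the other option, catching it on the spot, might look like the following. The class name CatchSketch and the helper openOrNull are made up for illustration; returning null on failure is just one possible design, not something these notes prescribe.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

class CatchSketch {
    // Returns a Scanner over the named file, or null if the file cannot be opened.
    // Because the exception is caught here, this method needs no throws clause.
    static Scanner openOrNull(String fileName) {
        try {
            return new Scanner(new File(fileName));
        } catch (FileNotFoundException e) {
            System.out.println("Could not open " + fileName + ": " + e.getMessage());
            return null;
        }
    }

    public static void main(String[] args) {
        Scanner input = openOrNull("hamlet.txt");
        if (input != null) {
            System.out.println("Opened the file.");
            input.close();
        }
    }
}
```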
Other exceptions:
- If you reach the end of a file and then ask for more input, the Scanner throws an exception
- NoSuchElementException
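One common way to avoid running off the end is to ask hasNext() before every next(). The sketch below assumes a made-up class name, EndOfInputSketch, and uses a Scanner over a String in place of a file so that it runs anywhere; the same calls work on a Scanner built from a File.

```java
import java.util.NoSuchElementException;
import java.util.Scanner;

class EndOfInputSketch {
    // Counts tokens safely: hasNext() is checked before every next().
    static int countTokens(Scanner input) {
        int count = 0;
        while (input.hasNext()) {
            input.next();
            count++;
        }
        return count;
    }

    // Returns true if asking for one more token than exists throws NoSuchElementException.
    static boolean readingPastEndThrows(Scanner input) {
        countTokens(input);       // consume everything
        try {
            input.next();         // nothing left to read
            return false;
        } catch (NoSuchElementException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(countTokens(new Scanner("to be or not to be")));   // prints 6
        System.out.println(readingPastEndThrows(new Scanner("to be")));       // prints true
    }
}
```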
A word on the correct way:
Scanner input = new Scanner(new File("hamlet.txt"));
versus the incorrect way:
Scanner input = new Scanner("hamlet.txt");
(The latter creates a Scanner that reads the literal text "hamlet.txt" as its input, rather than opening a file with that name)
NOTE: This is overloading in action (the Scanner constructor accepts multiple data types, e.g., a File or a String)
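To see what the String version of the constructor actually does, here is a small sketch (the class name StringVsFileSketch and the helper firstToken are made up). It hands a Scanner a plain String and reads the first token back.

```java
import java.util.Scanner;

class StringVsFileSketch {
    // Returns the first token a Scanner sees when handed a plain String.
    static String firstToken(String text) {
        Scanner input = new Scanner(text);   // Scanner(String): scans the text itself
        return input.next();
    }

    public static void main(String[] args) {
        // Prints hamlet.txt, i.e. the file NAME, because no file was ever opened.
        System.out.println(firstToken("hamlet.txt"));
    }
}
```

The compiler accepts both forms without complaint, which is why this mistake is easy to make and shows up only as strange output.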
Section 6.2: Token-Based Processing
Definitions
Definitions:
- Token-based processing
- Input cursor
- Consuming input
- File path
- Current directory
Token - a single chunk of letters or character data
- Usually WORDS separated by SPACES
- But could also be NUMBERS separated by COMMAS
- Or, other stuff...
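A sketch of both kinds of tokens, assuming a made-up class TokenSketch. By default a Scanner splits on whitespace; Scanner.useDelimiter takes a regular expression if the tokens are separated by something else, such as commas. The delimiter patterns shown are one reasonable choice, not the only one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class TokenSketch {
    // Splits text into tokens using the given delimiter regular expression.
    static List<String> tokens(String text, String delimiterRegex) {
        Scanner input = new Scanner(text);
        input.useDelimiter(delimiterRegex);
        List<String> result = new ArrayList<>();
        while (input.hasNext()) {
            result.add(input.next());
        }
        return result;
    }

    public static void main(String[] args) {
        // Words separated by spaces.
        System.out.println(tokens("to be or not", "\\s+"));
        // Numbers separated by commas (with optional spaces around each comma).
        System.out.println(tokens("3.5, 4, 10", "\\s*,\\s*"));
    }
}
```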
Example: file with 5 numbers
- Read in the first 5 numbers
- Cumulative sum of first 5 numbers
- Don't forget the throws clause
Output:
- Program outputs sum as 337.19999999 instead of 337.2 (floating-point roundoff: most decimal fractions cannot be stored exactly as a double)
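A sketch of this example follows. The contents of the original data file are not shown in these notes, so the five values below are an assumption, chosen only because they add up to 337.2 on paper. The class name SumFirstFive, the helper sumFirst, and the temporary file are likewise made up so that the program runs anywhere.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;
import java.util.Scanner;

class SumFirstFive {
    // Reads n numeric tokens from the scanner and returns their sum.
    static double sumFirst(Scanner input, int n) {
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            sum += input.nextDouble();   // consumes one token per call
        }
        return sum;
    }

    // FileNotFoundException is a kind of IOException, so this throws clause covers it.
    public static void main(String[] args) throws IOException {
        // Assumed sample data, written to a temporary file.
        Path path = Files.createTempFile("numbers", ".dat");
        Files.write(path, "308.2 14.9 7.4 2.8 3.9".getBytes());

        // Locale.US makes sure "308.2" is parsed with a decimal point on any machine.
        Scanner input = new Scanner(path.toFile()).useLocale(Locale.US);
        double sum = sumFirst(input, 5);
        input.close();

        System.out.println("raw sum     = " + sum);
        System.out.printf("rounded sum = %.1f%n", sum);   // rounded to one decimal place
        Files.delete(path);
    }
}
```

Printing with printf and a format such as %.1f is a simple way to hide roundoff in the output without changing the stored value.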
Utilize scanner functions:
- Scanners have next(), nextDouble(), etc. to read the next values
Structure of files:
- Computer sees a one-dimensional stream of characters: everything else is our own invention (e.g., a line break is just another character in the stream, so the computer doesn't see "lines" at all)
- Scanner handles details of, e.g., what to do when it gets to a newline char or a number char
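A small sketch of the "one-dimensional stream" idea, with a made-up class StreamSketch and helper flatten. It reads tokens from text laid out over several lines and shows that token-based reading produces the same result as if everything were on one line.

```java
import java.util.Scanner;

class StreamSketch {
    // Joins all tokens with single spaces, discarding the original whitespace layout.
    static String flatten(String text) {
        Scanner input = new Scanner(text);
        StringBuilder sb = new StringBuilder();
        while (input.hasNext()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(input.next());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Newlines, blank lines, and extra spaces are all just whitespace to next().
        System.out.println(flatten("308.2\n14.9   7.4\n\n2.8"));   // prints 308.2 14.9 7.4 2.8
    }
}
```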
Exceptions from wrong data type:
- InputMismatchException
- Pay close attention to error messages: they are not always clear, but they provide you with hints
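A sketch of how the mismatch arises and one way to guard against it, assuming a made-up class MismatchSketch. hasNextDouble() peeks at the next token without consuming it, so the program can decide whether nextDouble() is safe to call.

```java
import java.util.InputMismatchException;
import java.util.Locale;
import java.util.Scanner;

class MismatchSketch {
    // Sums only the numeric tokens, skipping anything that is not a number.
    static double sumNumbers(Scanner input) {
        double sum = 0.0;
        while (input.hasNext()) {
            if (input.hasNextDouble()) {
                sum += input.nextDouble();
            } else {
                input.next();   // consume and discard the non-numeric token
            }
        }
        return sum;
    }

    // Returns true if nextDouble() rejects the next token.
    static boolean nextDoubleFails(Scanner input) {
        try {
            input.nextDouble();
            return false;
        } catch (InputMismatchException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(nextDoubleFails(new Scanner("hello")));   // prints true
        System.out.println(sumNumbers(new Scanner("1.5 hello 2.5").useLocale(Locale.US)));
    }
}
```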
Moving through a file:
- Computer sees 1D stream of text
- Can't jump around - like a VCR tape
- So, current location/position is important (input cursor)
- Cursor moves down one char at a time
- Scanner handles details:
  - nextFloat() knows what to look for
  - advances cursor past the token it just read
Scanner object info:
- if we repeatedly call methods on the same Scanner, it doesn't reset the cursor
- one scanner --> one File, one position
- processTokens(input,2) --> first 2 tokens
- processTokens(input,3) --> processes tokens 3, 4, and 5 (not 1, 2, 3)
etc.
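The body of processTokens is not shown in these notes, so the version below is a hypothetical stand-in (in a made-up class CursorSketch) that just collects the tokens it consumes. It is only meant to show that a second call on the same Scanner picks up where the first one stopped.

```java
import java.util.Scanner;

class CursorSketch {
    // Hypothetical processTokens: consumes n tokens and returns them joined by spaces.
    static String processTokens(Scanner input, int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(input.next());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Scanner input = new Scanner("one two three four five six");
        System.out.println(processTokens(input, 2));   // prints one two
        System.out.println(processTokens(input, 3));   // prints three four five
    }
}
```

To start again from the first token, you would construct a new Scanner on the same source; the existing one cannot be rewound.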
Paths and directories:
Flags
CSC 142 - Intro to Programming I, South Seattle College.