Lecture Note
University: University of California San Diego
Course: DSC 207R | Python for Data Science
Academic year: 2023
Mastering UNIX Commands in Jupyter Notebooks

Working with data occasionally requires UNIX shell commands for tasks such as copying, moving, removing, or even processing data. Although we can run these commands from Python, Jupyter Notebooks offer a more interactive and user-friendly way to do so. In this tutorial, we'll look at how to use Jupyter Notebooks to run UNIX shell commands.

Using UNIX Commands in Jupyter Notebooks

Before we run specific commands, let's quickly go over how UNIX commands work in a Jupyter Notebook. We put an exclamation mark before a command to execute it, just as we would type the command in a UNIX shell. Jupyter uses your default shell to execute these commands, so any adjustments they need depend on the operating system your Jupyter environment runs on.

Executing Useful Commands for Data Scientists

Let's now go to our notebook and run a few commands that data scientists will find handy. We start with the ls command. Prefixed with an exclamation mark, ls displays the contents of the data directory called unix, which should sit in the same directory as this notebook. A data file named shakespeare.txt can be found in this folder.

To store the name of this data file, we'll use a local variable in the notebook, just as we would in a Python script. We can display the value stored in the filename variable the UNIX way, with the echo command prefixed by an exclamation mark (!echo $filename), or simply with the print function in Python.

Next, we'll display the first and last few lines of the file to get a basic understanding of the file header and footer. For this, we'll use the head and tail commands. We run head with an exclamation mark and -n 3 to display the top three lines, writing $filename so that the notebook substitutes the value of the Python variable into the command: !head -n 3 $filename.
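As a rough sketch of the steps so far, the Python code below mirrors the ls, echo, and head commands using the standard subprocess module, so it also runs outside a notebook. The temporary unix folder, the five-line sample text standing in for shakespeare.txt, and the run helper are assumptions made for illustration. In a notebook, each call corresponds to the ! command shown in its comment. The sketch assumes a UNIX-like system where ls, echo, and head are available.

```python
import subprocess
import tempfile
from pathlib import Path

# Hypothetical stand-in for the lecture's unix/shakespeare.txt data file.
data_dir = Path(tempfile.mkdtemp()) / "unix"
data_dir.mkdir()
filename = str(data_dir / "shakespeare.txt")
Path(filename).write_text("line one\nline two\nline three\nline four\nline five\n")


def run(args):
    """Run a command given as a list of arguments and return its output as text."""
    return subprocess.run(args, capture_output=True, text=True, check=True).stdout


# Notebook cell equivalent: !ls ./unix
listing = run(["ls", str(data_dir)])
print(listing)

# Notebook cell equivalent: !echo $filename
echoed = run(["echo", filename])
print(echoed)

# Notebook cell equivalent: !head -n 3 $filename
first_three = run(["head", "-n", "3", filename])
print(first_three)
```

Inside a notebook the three ! commands are shorter to type. The subprocess form is shown only because it behaves the same way in a plain Python script.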
To display the bottom 10 lines, we use the tail command with an exclamation mark.

Understanding the File Size

It's important to understand the file size when working with data files. To display the number of lines, words, and characters, we can use the wc command. Executing !wc $filename shows the number of lines, words, and characters in the file, in that order (by default wc counts bytes, which match characters for plain ASCII text). If we want to display just the number of lines, we can use wc with the -l option.
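The tail and wc steps can be sketched the same way. This is a minimal example, assuming a UNIX-like system with tail and wc installed. The twelve-line sample file is a hypothetical stand-in for shakespeare.txt, and the run helper is illustrative rather than part of the lecture.

```python
import subprocess
import tempfile
from pathlib import Path

# Hypothetical twelve-line sample file standing in for shakespeare.txt.
text = "".join(f"line {i}\n" for i in range(1, 13))
filename = str(Path(tempfile.mkdtemp()) / "shakespeare.txt")
Path(filename).write_text(text)


def run(args):
    """Run a command given as a list of arguments and return its output as text."""
    return subprocess.run(args, capture_output=True, text=True, check=True).stdout


# Notebook cell equivalent: !tail -n 10 $filename
last_ten = run(["tail", "-n", "10", filename])
print(last_ten)

# Notebook cell equivalent: !wc $filename
# wc prints the line, word, and byte counts followed by the file name.
lines, words, chars = (int(n) for n in run(["wc", filename]).split()[:3])
print(lines, words, chars)

# Notebook cell equivalent: !wc -l $filename
line_count = int(run(["wc", "-l", filename]).split()[0])
print(line_count)
```

Parsing the counts with split() keeps the sketch independent of how much padding a particular wc implementation puts in front of the numbers.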
Using Pipes and Filters

In Jupyter Notebooks, we can also use pipes and filters to process data with UNIX commands. A pipe passes the output of one command to another command as its input. For example, to count the number of lines, words, and characters in a file, we can print the file's contents with cat and feed them to wc. The pipe character | sends the output of cat to wc. Here is how the command would appear: !cat $filename | wc.

Conclusion

In conclusion, UNIX shell commands can be used in Jupyter Notebooks to handle data. When we put an exclamation mark before a command, Jupyter Notebooks use your default shell to carry it out.
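For reference, the cat-to-wc pipe from the Using Pipes and Filters section can be sketched in plain Python as follows. The two-line sample text is a hypothetical stand-in for shakespeare.txt, and the sketch assumes a UNIX-like system whose shell provides cat and wc. Passing shell=True hands the whole command line to the shell, which roughly matches what the exclamation mark does in a notebook.

```python
import shlex
import subprocess
import tempfile
from pathlib import Path

# Hypothetical two-line sample file standing in for shakespeare.txt.
text = "to be or not to be\nthat is the question\n"
filename = str(Path(tempfile.mkdtemp()) / "shakespeare.txt")
Path(filename).write_text(text)

# Notebook cell equivalent: !cat $filename | wc
# shlex.quote protects the path in case it contains spaces or special characters.
command = f"cat {shlex.quote(filename)} | wc"
result = subprocess.run(command, shell=True, capture_output=True, text=True, check=True)

# Because wc reads from the pipe, it prints only the three counts, with no file name.
lines, words, chars = (int(n) for n in result.stdout.split())
print(lines, words, chars)
```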