“Dettererererere, dettererererere, if there’s something strange~, in the neighbourhood~, who do you call~?”
“”G-Ghostbusters…!””
Inoue and I were in trouble because Saito was in high spirits as he asked for a call and response. However, it was no wonder that Saito, a movie buff, was excited in this situation.
The three of us had been given a machine to wear on our backs with a gun like tool connected to it via a cable. Saito called it a beam pack.
And, so-called ghosts were flying around the sheet. A green, blobby fat one. A ghost in tattered cloths with yellow glowing eyes. A translucent runner with legs that kept running at full speed despite being a ghost. There were ghost of various shapes and sizes. These guys all had different appearances, but it seemed like they were all IMPORTXML’s.
##########################
“Today’s quest is a web function.”
It was my first day back after being hospitalised by the DPRODUCT.
“You remember the IMAGE function, right?”
“Yes, it’s a function that fetches an image from a URL and then displays it.”
“What we’re dealing with today is are functions that retrieve information from the web via a URL as an argument. Inoue, explain the rest.”
“Saito-san, haven’t you been giving me too many things to explain recently?”
“No, no, you know more than me in this area after all.”
Inoue seemed to be in a good mood as she began to explain.
“I guess so, I’ll start by explaining the things that are easy to understand then.”
Inoue wrote out the syntax onto the whiteboard.
=IMPORTRANGE(Spreadsheet Key, Range String)
“The Spreadsheet Key, does that mean that IMPORTRANGE imports data from other sheets?”
“That’s right. Whether it’s your own sheet or someone else’s sheet, you can import the data as long as you have viewing privileges. It’s convenient and easy to use, so you can see it in many other places aside from engineering sheets. If you use Google Apps at work, it can be used by other employees to refer to a sheet of data that was compiled by someone else.”
“I used to reference across files in Excel, but isn’t it inconvenient if the folder or file names have to change?”
“This is true for Excel, but once we create a file URL with Google Spreadsheets, it stays the same even if the folder, file name, or owner changes. That kind of thing is unlikely to happen.”
I see, it does seem convenient. If my company had been able to implement Google Apps while I was still alive, I’m sure it would have made things a little easier. Inoue continued to explain.
=IMPORTDATA(URL)
“IMPORTDATA has a fancy name, but its function is very specific. It is a function that imports CSV or TSV data from the web. You know what CSV and TSV files are, right?”
“Those are Comma Separated Values and Tab Separated Values data. I often downloaded them from various systems and pasted them into Excel.”
“That would have been the case at Takahashi-san’s previous job. IMPORTDATA is a function that allows you to download CSV or TSV files from the web and import them without having to paste it in.”
“Do such CSV files exist? In my industry, I had to log in to download most files, so it doesn’t give me the impression of being useful.”
“You often see it in government statistics. However, I don’t see this function often myself because many CSV’s aren’t clean.”
I see, I do remember seeing some discussion on the internet about how government CSV statistics were difficult to use.
“Now then, the other IMPORT’s may be a little harder to understand.”
Inoue wrote the syntax of another function.
=IMPORTHTML(URL, Query, Index)
“Query? Is it SQL again?”
“No, when using a query with the IMPORT function, you have to specify the type of information to retrieve and there are a few options. With IMPORTHTML, you only have ‘Table’ or ‘List’, so it’s easy.”
Inoue started writing examples that included specific URLs.
=IMPORTHTML(“https://kakuyomu.jp/works/1177354054887646455”, “list”, 7)
“This is an example of retrieving a list of chapter titles from a certain work on a certain novel submission site. If the data is surrounded by <li> tags, it will be retrieved by the ‘list’ option. The 7 means that you’re looking for the 7th list element in the HTML of that page. By the way, if you ran that as is, you would be able to import a list of chapters up to the current 60th chapter of the series.”
“What certain work on a certain novel submission site?”
“Next is IMPORTXML.”
Eh? Why was I ignored?
=IMPORTXML(URL, XPath Query)
“Although it’s called XML, it’s actually often used to extract specific elements from normal HTML web pages.”
=IMPORTXML(“https://kakuyomu.jp/works/1177354054887646455”, “//*[@id=’workPoints’]/a/span”)
“Using the same novel from earlier, if you write it like this, you can get the number of ★ reviews as a numerical value. Currently, there are 125 ★’s, but I hope it gets more ★’s.”
“So what is that novel!?”
“By the way, XPath can be easily extracted using the browser’s developer tools. However, if double quotes are included, it will interfere with the function string enclosure, so you should either change it to single quotes or reference it from another cell. It is also a good idea to write it down for future reference.”
Why was I being ignored? Did I ask about something I shouldn’t have?
“IMPORTXML is a very useful function for web crawling and you can also used it to extract the same elements from multiple pages and compare them on a sheet. For example, you could compare the ★’s of multiple novels side by side on a sheet. Beacuse of that, it often appears in groups compared to other web functions.”
I understood the explanation, but it still didn’t make sense. Why was that? Inoue ignored me as she kept explaining.
=IMPORTFEED(URL, [Query], [Header], [Number of Items])
“Finally, there’s IMPORTFEED. This is for reading RSS, Atom, and content formats from blogs and news sites. Do you use an RSS reader?”
“Mmm, I used to use one back in university. At some point I switched to using news apps and Twitter, so I stopped using it.”
“If you’ve used it before, you should be fine. It’s the same RSS.”
=IMPORTFEED(“https://kakuyomu.jp/info/feed”,”items”,TRUE, 5)
“This is an example for getting notifications from a certain novel submission site.”
Here it is again.
“The second argument, Query, defaults to ‘items’, but you can also select ‘feed’. But, since that only obtains simple information about the feed itself, most of the time, ‘item’ is used. The argument after that is whether to display the headline or not. As for ‘Number of Items’, that specifies how many articles to display from the feed. That’s all. Do you understand?”
I understood the explanation of the functions, but I didn’t feel like responding. Then, the ghosts appeared.
※Functions
IMPORTFEED https://support.google.com/docs/answer/3093337
IMPORTHTML https://support.google.com/docs/answer/3093339
IMPORTDATA https://support.google.com/docs/answer/3093335
IMPORTXML https://support.google.com/docs/answer/3093342
IMPORTRANGE https://support.google.com/docs/answer/3093340