News article summarizer

About the project:

Smoosh is a text summarizer that works on articles, books, or any other text content. It currently exists as a CLI, and may be deployed as a web service or API in the future.

Smoosh's algorithm calculates the importance of each word in a given text, then summarizes the text by printing only the sentences it deems necessary to understand the text's meaning. By default it produces a 7-sentence summary, but it can return as many sentences as desired.
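As a rough illustration of the idea described above, here is a minimal frequency-based sketch. It is not Smoosh's actual implementation: the names `summarize` and `split_sentences`, the stopword list, and the scoring rule (sum of word frequencies per sentence) are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical stopword list; the real project may use a different one, or none.
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "it",
             "that", "for", "on", "with", "as", "was", "by"}


def split_sentences(text):
    # Naive split: break after ., ! or ? when followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def summarize(text, n_sentences=7):
    sentences = split_sentences(text)

    # Word importance is approximated here by raw frequency in the whole text.
    words = [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]
    freq = Counter(words)

    def score(sentence):
        tokens = re.findall(r"[a-z']+", sentence.lower())
        return sum(freq[t] for t in tokens if t not in STOPWORDS)

    # Pick the highest-scoring sentences, then restore their original order.
    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:n_sentences])
    return " ".join(sentences[i] for i in keep)


if __name__ == "__main__":
    sample = "Cats chase mice. Dogs bark loudly. Cats and mice are cats. The weather is nice."
    print(summarize(sample, n_sentences=2))
```

Keeping the selected sentences in their original order, rather than in score order, is a design choice of this sketch that tends to make the summary read more naturally.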

Smoosh can take either a .txt file or a URL to a website containing a news article. Don't waste time reading long news articles anymore!

smoosh summarizes text. It can run on live web pages or on static text content.

If you'd like to smoosh an article from the web, pass its URL to smoosh:

$ python3

And smoosh will output the smooshed result:

Trump And Kim Arrive In Singapore For Unprecedented Summit  Enlarge this image toggle caption Evan Vucci/AP Evan Vucci/AP  Updated at 11:47 a.m. ET  President Trump and North Korean leader Kim Jong Un arrived in Singapore Sunday ahead of a highly anticipated summit.  Aboard Air Force One en route to Singapore, Trump tweeted, "I look forward to meeting [Kim] and have a feeling that this one-time opportunity will not be wasted!"  Singapore Foreign Minister Vivian Balakrishnan shared a photo that showed him welcoming Kim Jong-Un to Singapore.  Enlarge this image toggle caption Singapore Ministry of Communications and Information/Getty Images Singapore Ministry of Communications and Information/Getty Images  President Trump has said he will use the summit, scheduled to begin Tuesday, to push for North Korea's denuclearization. Trump and Kim Jong-Un have both suggested they could also pursue a peace treaty to officially end the Korean War, which ceased because of an armistice signed in 1953."  Hammering out a complete plan for North Korea's denuclearization would be "an impossible chore" at this summit, Michael O'Hanlon of the Brookings Institution told NPR's Windsor Johnston."  Trump arrived in Singapore after another meeting with foreign leaders -- a group of close American allies.

-_-_-_-_-_-_ METRICS _-_-_-_-_-_-
Original length: 3409 characters
Smooshed length: 1299 characters

Original smooshed by 61.89%
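The percentage in the metrics block is consistent with a simple character-count reduction, rounded to two decimal places. The sketch below reproduces the figure from the two lengths shown above; the function name `smoosh_percentage` is hypothetical and the formula is inferred from the sample output, not taken from Smoosh's source.

```python
def smoosh_percentage(original_length, smooshed_length):
    # Share of characters removed, as a percentage rounded to two decimals.
    return round((1 - smooshed_length / original_length) * 100, 2)


if __name__ == "__main__":
    print(f"Original smooshed by {smoosh_percentage(3409, 1299)}%")
```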

Or, you can pass in a text file (the article I'm using can be found here):

$ python3 -i article.txt

And smoosh will output the smooshed result:

The expansion would enable SpaceX to store and refurbish large numbers of Falcon rocket boosters and nose cones at the operations center down the road from NASA's Vehicle Assembly Building.  "As SpaceX's launch cadence and manifest for missions from Florida continues to grow, we are seeking to expand our capabilities and streamline operations to launch, land and re-fly our Falcon family of rockets," said James Gleeson, a SpaceX spokesman. Here's what SpaceX has in mind to start:  The most eye-catching feature would be a 32,000-square-foot tower standing up to 300 feet tall, housing a "world-class, architecturally distinctive" launch and landing control center.  The site would replace or add to SpaceX's current launch and landing control center, which is tucked in a small office space outside the south gate to Cape Canaveral Air Force Station near Port Canaveral.  After landing on Cape Canaveral pads or SpaceX's "drone ship" at sea, recovered Falcon 9 and Falcon Heavy boosters would return to a 133,000-square-foot hangar standing up to 100 feet tall.  Rivaling the open-air exhibit of famous spacecraft at the nearby KSC Visitor Complex, SpaceX plans to display "historic space vehicles" in its own rocket garden, potentially including Falcon boosters or Dragon capsules staged vertically or horizontally. The SpaceX Operations Area would expand the company's KSC footprint beyond the hangar it built at the base of pad 39A, which can house three Falcon boosters.

-_-_-_-_-_-_ METRICS _-_-_-_-_-_-
Original length: 6035 characters
Smooshed length: 1480 characters

Original smooshed by 75.48%

You can also use the -v/--verbose option to get more statistics about your article:

$ python3 --verbose -i article.txt


-_-_-_-_-_-_ METRICS _-_-_-_-_-_-
Original length: 6035 characters
Smooshed length: 1480 characters

Original smooshed by 75.48%

Most common words:
1. spacex (15 times)
2. would (14 times)
3. launch (12 times)
4. space (10 times)
5. falcon (10 times)
6. area (7 times)
7. nasa (7 times)

Most important sentences:
1. Sentence #4
2. Sentence #5
3. Sentence #12
4. Sentence #16
5. Sentence #17
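The "Most common words" list in the verbose output looks like a plain frequency count over lowercased words. A minimal sketch of how such a report could be produced follows; `most_common_words` and `format_word_report` are hypothetical names, and the optional `stopwords` filter is an assumption, since the sample output does not show which words (if any) Smoosh excludes.

```python
import re
from collections import Counter


def most_common_words(text, top_n=7, stopwords=frozenset()):
    # Lowercase the text, pull out word-like tokens, and count them.
    words = [w for w in re.findall(r"[a-z']+", text.lower()) if w not in stopwords]
    return Counter(words).most_common(top_n)


def format_word_report(counts):
    # Render (word, count) pairs in the same shape as the verbose output.
    return "\n".join(
        f"{rank}. {word} ({count} times)"
        for rank, (word, count) in enumerate(counts, start=1)
    )


if __name__ == "__main__":
    counts = most_common_words("SpaceX launch SpaceX rocket launch SpaceX", top_n=2)
    print("Most common words:")
    print(format_word_report(counts))
```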