Description: Google BERT applied to the fake_or_real news dataset, with a best F1 score of 0.986.

## Showcase

### 1. Pipeline

First, we take the raw data, which contains a title, text, and label for each article. We then apply several data-processing steps to the text. After processing, the text is fed into the BERT model for training. The model consists of BERT itself and a classifier; here I used a feed-forward neural network followed by a softmax layer to normalize the output. In the end, we get the predictions and other details.

### 2. Part1: Data processing

(1) Drop non-sentence

• Type1: http[s]://www.claritypress.com/LendmanIII.html
• Type2: [email protected]
• Type3: @EP_President #EP_President
• Type5: ☮️ 💚 🌍 etc
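The four types above could be stripped with regular expressions along the following lines. This is a minimal sketch, not the patterns actually used in this project: the pattern names, the `drop_non_sentence` helper, the emoji code-point ranges, and the `example.com` address in the usage note are all assumptions made for illustration.

```python
import re

# Illustrative patterns only; the project's real expressions are not shown in this README.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")
MENTION_HASHTAG_RE = re.compile(r"[@#]\w+")
# Rough emoji coverage: pictographs, misc symbols/dingbats, variation selector, joiner.
EMOJI_RE = re.compile("[\U0001F300-\U0001FAFF\u2600-\u27BF\uFE0F\u200D]")


def drop_non_sentence(text):
    """Remove URLs, emails, @mentions/#hashtags and emoji, then tidy whitespace."""
    # Emails are removed before mentions so that "@domain" is not stripped on its own.
    for pattern in (URL_RE, EMAIL_RE, MENTION_HASHTAG_RE, EMOJI_RE):
        text = pattern.sub(" ", text)
    return re.sub(r"\s+", " ", text).strip()
```

For example, `drop_non_sentence("@EP_President #EP_President spoke")` leaves only `spoke`, and a hypothetical address such as `john.doe@example.com` is removed as a whole.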

(2) EDA methods

• Insert word by BERT similarity (Random Insertion)
• Substitute word by BERT similarity (Synonym Replacement)

For the first part, I use two kinds of methods: dropping non-sentence text and EDA (easy data augmentation) methods. Reading through the fake_or_real news texts, I found that they contain various types of non-sentence content, so I use regular expressions to drop them. I then use random insertion and synonym replacement to augment the text.
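The two augmentation operations can be sketched as below. Note the swap: a tiny hand-written table (`TOY_SIMILAR`, hypothetical) stands in for BERT similarity so that the example runs without downloading a model. The function names and the `n`/`rng` parameters are likewise assumptions, not this project's API.

```python
import random

# Hypothetical stand-in for BERT similarity: a tiny hand-written lookup table.
TOY_SIMILAR = {
    "news": ["report", "story"],
    "fake": ["false", "bogus"],
}


def similar_words(word):
    """Return words considered similar to `word` (empty list if none are known)."""
    return TOY_SIMILAR.get(word.lower(), [])


def synonym_replacement(tokens, n=1, rng=random):
    """Replace up to n tokens with a similar word; the token count is unchanged."""
    out = list(tokens)
    candidates = [i for i, tok in enumerate(out) if similar_words(tok)]
    for i in rng.sample(candidates, min(n, len(candidates))):
        out[i] = rng.choice(similar_words(out[i]))
    return out


def random_insertion(tokens, n=1, rng=random):
    """Insert up to n words, each similar to some existing token, at random positions."""
    out = list(tokens)
    for _ in range(n):
        candidates = [tok for tok in out if similar_words(tok)]
        if not candidates:
            break
        new_word = rng.choice(similar_words(rng.choice(candidates)))
        out.insert(rng.randrange(len(out) + 1), new_word)
    return out
```

Passing a seeded `random.Random` instance as `rng` makes the augmentation reproducible. In a real setup, `similar_words` would be replaced by suggestions from a BERT model.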

### 3. Part2: Bert Model

For the second part, we feed the text produced by the first part into the BERT model. The model uses 12 encoder layers followed by a classifier to produce the output.
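To make the classifier step concrete, here is a minimal pure-Python sketch of a single feed-forward layer followed by softmax, applied to a sentence vector taken from the encoder. It is not the project's PyTorch code: the toy two-dimensional vector, the weights, and the `FAKE`/`REAL` label names are assumptions chosen for illustration.

```python
import math


def softmax(logits):
    """Normalize raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def linear(x, weights, bias):
    """One feed-forward layer: a weighted sum of the inputs plus a bias per class."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def classify(pooled, weights, bias, labels=("FAKE", "REAL")):
    """Map an encoder output vector to a label and the class probabilities."""
    probs = softmax(linear(pooled, weights, bias))
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs
```

With a toy input `[1.0, -1.0]`, identity weights, and zero bias, `classify` picks the first label, and the two probabilities sum to 1. In the real model the input vector is much larger and the weights are learned during training.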

### 4. Part3: Result

In the end, we combine different data-processing methods, and you can see the F1 score for each combination in the chart. The best F1 score (0.986) comes from cased text + drop non-sentence.
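For reference, the F1 score is the harmonic mean of precision and recall. The sketch below computes it for one class treated as positive. Whether the reported 0.986 is a binary, macro, or weighted F1 is not stated here, so the `positive="FAKE"` default and the function name are assumptions.

```python
def f1_score(y_true, y_pred, positive="FAKE"):
    """Binary F1 for the `positive` class: 2 * P * R / (P + R)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both 0 or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

For instance, with one of two fake articles caught and no false alarms, precision is 1.0, recall is 0.5, and F1 is 2/3.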

I learned about EDA from two websites. From two articles, I learned that we shouldn't remove stopwords, because doing so destroys the context of the sentence. Finally, there is the implementation of BERT with PyTorch and the BERT model I learned from.

## Implementation

### 4. Final output

#### 4.2 F1 and other details

