By 苏剑林 | August 27, 2019
Sharing my personal implementation of bert4keras:
https://github.com/bojone/bert4keras
This is my re-implementation of BERT for Keras, which aims to make BERT usable from Keras with code that is as clean and simple as possible.
The basic implementation of BERT is complete, and it loads the official pre-trained weights successfully. Its output has been verified to match keras-bert, so you can use it with confidence.
The original intention of this project was to provide convenience for modification and customization, so it may be updated frequently.
Therefore, stars are welcome, but I do not recommend forking, as the version you fork may quickly become outdated.
Quick Installation:
pip install bert4keras
Reference Code:
from bert4keras.models import build_transformer_model
from bert4keras.tokenizers import Tokenizer
import numpy as np
config_path = '/path/to/bert_config.json'
checkpoint_path = '/path/to/bert_model.ckpt'
dict_path = '/path/to/vocab.txt'
tokenizer = Tokenizer(dict_path, do_lower_case=True) # Create tokenizer
model = build_transformer_model(config_path, checkpoint_path) # Build model and load weights
token_ids, segment_ids = tokenizer.encode(u'语言模型') # Encode a sample sentence ("语言模型", i.e. "language model")
print(model.predict([np.array([token_ids]), np.array([segment_ids])])) # Print the encoded vectors for each token
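Since build_transformer_model returns an ordinary Keras model, fine-tuning works in the usual Keras way: stack your own layers on top of its output and compile. A minimal sketch, continuing from the code above (the [CLS]-vector pooling, the 2-class Dense head, and the optimizer are illustrative choices of mine, not something prescribed by the project):

from keras.layers import Lambda, Dense
from keras.models import Model

# Take the vector at the [CLS] position as the sentence representation
cls_vector = Lambda(lambda x: x[:, 0])(model.output)
# A hypothetical 2-class classification head; adjust to your own task
probs = Dense(2, activation='softmax')(cls_vector)

clf_model = Model(model.input, probs)  # same two inputs: token ids and segment ids
clf_model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
clf_model.summary()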
The keras-bert-based examples previously provided in "When BERT Meets Keras: This Might Be the Simplest Way to Open BERT" still apply to this project; you only need to replace the way base_model is loaded with this project's loading method, as sketched below.
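Concretely, if the old keras-bert code built base_model with load_trained_model_from_checkpoint, the swap should look roughly like this (a sketch under that assumption; the paths are the same placeholders as above):

# keras-bert style, as in the earlier article:
# from keras_bert import load_trained_model_from_checkpoint
# base_model = load_trained_model_from_checkpoint(config_path, checkpoint_path)

# bert4keras replacement:
from bert4keras.models import build_transformer_model
base_model = build_transformer_model(config_path, checkpoint_path)

Everything downstream of base_model can stay as it was.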
Currently, only Python 2.7 is guaranteed to be supported; the test environment is TensorFlow 1.8+ with Keras 2.2.4+.
(Some friends have tested it and found that Python 3 works directly without errors, so Python 3 users can try it out. However, I haven't tested it myself, so I cannot guarantee it.)
Of course, friends who are willing to contribute are welcome to point out bugs, suggest corrections, or even submit Pull Requests!
Previously, I had been using keras-bert by CyberZHG. If the goal is purely to call and fine-tune BERT within Keras, keras-bert is already quite satisfactory.
However, if you want to modify BERT's internal structure while still loading the official pre-trained weights, keras-bert struggles to meet that need. For the sake of code reuse, keras-bert packages almost every small module as a separate library: keras-bert depends on keras-transformer, keras-transformer depends on keras-multi-head, and keras-multi-head depends on keras-self-attention. This multi-layered dependency makes modification quite a headache.
Therefore, I decided to rewrite a Keras version of BERT, striving to implement it completely within a few files, reducing these dependencies while retaining the capability to load official pre-trained weights.
Thanks to CyberZHG for implementing keras-bert. My implementation draws on the keras-bert source code in many places, and I would like to express my sincere gratitude for that selfless contribution.