How to install CentOS 7 using the GUI in VirtualBox

It took me a while to get this installation working on my machine. First, I had to install VirtualBox, since I was previously using VMware Workstation. VirtualBox was therefore already installed before I started on CentOS 7.

Set up a new VM

When you launch VirtualBox, the VM list is either empty or shows the virtual machines (VMs) you set up before. To set up a new VM, click the “New” button and follow the wizard, a guided mode for setting up the VM. If you are familiar with VirtualBox, you can use Expert Mode instead.

There are a few things to do on this screen:

  • Name the VM
  • Set the folder path
  • Select Type: Linux
  • Version: Red Hat (64-bit)

Note: CentOS is a clone of Red Hat Enterprise Linux and uses a similar architecture, which is why the Red Hat version is selected.

On this screen, you allocate the amount of memory for the virtual machine. In my setup, I left it at the default. You can allocate more if your machine has enough memory.

On this screen, choose to create a virtual hard disk (VDI) and proceed.

Choose how the virtual disk is stored on the physical hard disk. There are two options:

A fixed-size disk is not recommended here because you will be downloading many packages to run various applications.

A dynamically allocated disk uses space on the physical hard disk only as it fills up. Select dynamically allocated, and make sure that your hard drive has enough free space. 15 GB is sufficient to start with.

Click “Next” to proceed.

Click “Create” to finish the setup. Once the virtual machine has been created successfully, it appears in the list of VMs.
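The wizard steps above can also be sketched as VBoxManage commands (VBoxManage is the command-line tool that ships with VirtualBox). This is only a rough sketch, not what the wizard runs internally. The VM name, the folder, the 1024 MB memory value and the "SATA" controller name are my own assumptions for illustration; the 15 GB dynamically allocated VDI comes from the steps above. The function only builds the commands, so nothing is executed.

```python
from pathlib import Path


def build_create_vm_commands(name, base_folder, memory_mb=1024, disk_gb=15):
    """Build VBoxManage commands that roughly mirror the New VM wizard.

    name, base_folder and memory_mb are hypothetical example values.
    The commands are returned as argument lists and are not executed here.
    """
    disk_path = str(Path(base_folder) / name / f"{name}.vdi")
    return [
        # Name the VM, set the folder path, choose type Red Hat (64-bit)
        ["VBoxManage", "createvm", "--name", name, "--ostype", "RedHat_64",
         "--basefolder", str(base_folder), "--register"],
        # Allocate memory to the VM (value in MB)
        ["VBoxManage", "modifyvm", name, "--memory", str(memory_mb)],
        # Create a VDI disk; the "Standard" variant is dynamically allocated
        ["VBoxManage", "createmedium", "disk", "--filename", disk_path,
         "--size", str(disk_gb * 1024), "--format", "VDI",
         "--variant", "Standard"],
        # Add a storage controller (named "SATA" here) and attach the disk
        ["VBoxManage", "storagectl", name, "--name", "SATA", "--add", "sata"],
        ["VBoxManage", "storageattach", name, "--storagectl", "SATA",
         "--port", "0", "--device", "0", "--type", "hdd",
         "--medium", disk_path],
    ]


for command in build_create_vm_commands("centos7", "vms"):
    print(" ".join(command))
```

If you want to actually run the commands, each list could be passed to subprocess.run, assuming VBoxManage is on your PATH.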

You can run the virtual machine now, but it has nothing to install from yet. You first need to download the CentOS ISO image and link it to the newly created virtual machine.

Where to download the ISO image?

I downloaded the ISO image from this link. The download may take a while to complete due to the file size. The file comes with a .iso file extension.

Link up ISO image with VM

Click the “Settings” button and go to “Storage”. Under the optical drive (shown as “Empty”), select the ISO image (.iso) file that you downloaded earlier. You also need to enable the network adapter so that the VM can use the internet to download the required packages.
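For anyone who prefers the command line, here is a rough sketch of the same linking step with VBoxManage. The VM name, the ISO file name, the controller name "IDE" and the port number are all assumptions for illustration, so check the Storage page of your own VM for the real controller name. NAT is only one possible network mode. Again, the commands are only built, not executed.

```python
def build_attach_iso_commands(name, iso_path, controller="IDE", port=1):
    """Build VBoxManage commands to attach an ISO and enable networking.

    controller and port are hypothetical defaults; they must match the
    storage controller that holds the optical drive of your VM.
    """
    return [
        # Put the ISO image into the optical drive
        ["VBoxManage", "storageattach", name, "--storagectl", controller,
         "--port", str(port), "--device", "0", "--type", "dvddrive",
         "--medium", iso_path],
        # Enable the first network adapter in NAT mode
        ["VBoxManage", "modifyvm", name, "--nic1", "nat"],
        # Start the VM with the normal GUI window
        ["VBoxManage", "startvm", name, "--type", "gui"],
    ]


for command in build_attach_iso_commands("centos7", "CentOS-7.iso"):
    print(" ".join(command))
```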

Start the virtual machine

Click the “Start” button to start the virtual machine. VirtualBox offers different options for running the virtual machine. When the boot menu appears, select “Install CentOS Linux 7” and proceed to install. Again, it will take a while to load the required packages and complete the installation.

Once it is ready, you will see the opening screen of the installer. It asks for basic information to set up the server, such as language, time zone and user account. Set these up as required.


Introduction to Kafka

Back in mid-March, I told my colleague that I wanted to continue learning even after my Specialized Diploma in Business Analytics. I shared that I wanted to learn about Apache Kafka and to know more about data technology from the infrastructure side. It is an area that I have explored less. Big Data has been part of our lives for years now, and I have not gained any practical experience with it at work. If I want to look for a job related to Big Data Analytics or Big Data Engineering, this exploration gives me a fundamental understanding and prepares me for interviews.

To begin, I read a few articles on the Internet and got my first idea of Apache Kafka. The need for it arises when we have multiple source systems and target systems, and each integration between them needs its own configuration. Each of these integrations comes with difficulties in:

  • Protocol – how the data is transported (example: HTTP, REST, TCP, etc).
  • Data format – how the data is parsed (example: CSV, JSON, binary, etc).
  • Data schema – how the data is shaped and may change.

On top of that, each source system may carry an increased load from all of these connections.
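To make the problem concrete, here is a small sketch of the arithmetic. It assumes the worst case where every source system talks to every target system directly, and the numbers (4 sources, 6 targets) are made up for illustration. With a central platform like Kafka in the middle, each system only needs one integration, to Kafka itself.

```python
def point_to_point_integrations(sources, targets):
    """Every source is wired directly to every target."""
    return sources * targets


def integrations_via_broker(sources, targets):
    """Every system is wired once to a central broker."""
    return sources + targets


sources, targets = 4, 6  # made-up example numbers
print(point_to_point_integrations(sources, targets))  # 24
print(integrations_via_broker(sources, targets))      # 10
```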

Why Apache Kafka?

Kafka decouples the data streams and the systems.

It is a highly scalable, fault-tolerant, publish-subscribe messaging system that enables you to develop distributed applications. If you are unfamiliar with Kafka, here I am going to share the knowledge that I gained when I read some of the articles online.

What is Apache Kafka?

Apache Kafka is a high-throughput distributed messaging system (or streaming platform). High throughput means it can handle millions of messages on modest hardware, and scalable means it can grow without downtime. It was created at LinkedIn, and it is now an open-source project under the Apache Software Foundation, with Confluent as a major contributor.

You can have data streams from websites, microservices, financial transactions, etc. Once the data is in Kafka, you may want to put it into your databases, analytics systems, email systems, etc.

Kafka is used for two broad classes of applications:

  • Building real-time streaming data pipelines that reliably get data between systems or applications.
  • Building real-time streaming applications that transform or react to the streams of data.

Kafka Concepts

  • Kafka runs as a cluster on one or more servers that can span multiple data centres.
  • The Kafka cluster stores streams of records in categories called topics.
  • Each record consists of a key, a value, and a timestamp.

The key to Kafka is the log data structure, which is different from application logs. Recall the concept mentioned above: each record consists of a key, a value and a timestamp. These records are what the log stores. Databases work in a similar way: they write change events to a log and derive the value of columns from that log. In Kafka, messages are written to a topic that maintains this log (or multiple logs, one for each partition), from which subscribers can read and derive their own representations of the data. Put simply, you can think of it as an “activity” log.
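To help me picture this, I wrote a minimal in-memory sketch of a topic as a set of append-only logs. It is not Kafka's API: the Record and Topic classes and the two-partition default are hypothetical names and values for illustration. The CRC32 hash is only a stand-in for choosing a partition from the key and is not Kafka's actual partitioner.

```python
import time
import zlib
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Record:
    key: str
    value: str
    timestamp: float


@dataclass
class Topic:
    name: str
    num_partitions: int = 2
    partitions: list = field(init=False)

    def __post_init__(self):
        # One append-only log (a plain list) per partition
        self.partitions = [[] for _ in range(self.num_partitions)]

    def append(self, key, value):
        """Append a record and return its (partition, offset)."""
        # Stand-in partitioner: the same key always maps to the same partition
        partition = zlib.crc32(key.encode("utf-8")) % self.num_partitions
        log = self.partitions[partition]
        log.append(Record(key, value, time.time()))
        return partition, len(log) - 1

    def read(self, partition, offset):
        """Return every record in a partition from the given offset onwards."""
        return self.partitions[partition][offset:]


topic = Topic("activity")
partition, first_offset = topic.append("user-1", "login")
topic.append("user-1", "click")
for record in topic.read(partition, first_offset):
    print(record.key, record.value)
```

Records are only ever appended, never updated in place, which is what makes the log behave like an activity history that readers can replay from any offset.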

What does Kafka not do?

From what I read, Kafka does not have individual message IDs. Messages are simply addressed by their offset in the log. Kafka does not track the subscribers that a topic has or who has consumed which messages. All of that is left up to the subscribers.

I will cover a bit more on producers, topics and subscribers in another blog entry.
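Here is a small sketch of that idea, following the description above: the log is just a list, and each subscriber remembers its own offset. The Consumer class and its poll method are hypothetical names for illustration and not the Kafka client API.

```python
class Consumer:
    """A subscriber that keeps track of its own position in the log."""

    def __init__(self, log, start_offset=0):
        self.log = log
        self.offset = start_offset  # the log itself knows nothing about this

    def poll(self, max_records=10):
        """Read the next records and advance this consumer's own offset."""
        batch = self.log[self.offset:self.offset + max_records]
        self.offset += len(batch)
        return batch


log = ["signup", "login", "purchase"]  # messages are addressed by list index
fast = Consumer(log)
slow = Consumer(log)

print(fast.poll())               # ['signup', 'login', 'purchase']
print(slow.poll(max_records=1))  # ['signup']

log.append("logout")
print(fast.poll())               # ['logout']
print(slow.poll())               # ['login', 'purchase', 'logout']
```

Both consumers read the same log at their own pace, and neither affects what the other sees.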

Kafka and Big Data

Kafka’s ability to ingest and move large amounts of data very quickly has led companies such as Netflix to adopt it. Netflix uses Kafka to dump data into Amazon S3 and uses Hadoop to run batch analytics on video streams, UI (user interface) activities, performance events and diagnostic events to help drive feedback about user experience. It has also paired Kafka with streaming stacks like Apache Spark and Apache Samza to route data, loading it into back-end data stores like Elasticsearch and Cassandra as well as directly into real-time analytics engines.

In conclusion, whether you should use Kafka depends on your use case, because it has its own benefits and limitations.


VirtualBox Installation

I installed VirtualBox on my machine recently, and I planned to write about the installation steps for a Windows machine. It is easy: the wizard guides you through the steps during the installation. First, download the installer from the website. I am using this link.

Once the download completes, run the .exe file to open the setup wizard. Depending on the version you downloaded, the interface may look different. I am using version 6.1.4 for this installation.

Set up using the wizard

Click “Next” to proceed.

Next, select the location where you want to install the program. I left it at the default location. This screen also shows the disk space required to install the software on your machine. Click “Next” to proceed.

Next, you can choose whether to create shortcuts on your machine. In my case, I unticked the checkboxes for creating shortcuts on the desktop and in Quick Launch. Click “Next” to proceed.

Then, it shows a warning page. You can just click “Next” to proceed, then click “Install” to install the software on your machine. Make sure you allow the wizard to continue installing the software when it prompts you.

Launch the VirtualBox

You can begin to use VirtualBox once you have downloaded some images to run in it.

If your machine is running Linux, you can install VirtualBox from this link. It lists the commands that install the software. Choose the correct Linux version to begin the installation.

Chong Qing Grilled Fish

It is located at Mosque Street, Chinatown, and it is the same restaurant as the Bugis branch at Liang Seah Street. There was no queue on the day I visited with two friends. I had made a reservation, and we got our table after 10 minutes of waiting. They provide some drinks at the entrance of the restaurant for waiting customers.

Grilled Fish with Mild Spiciness

We ordered the grilled seabass at a mild level of spiciness with a few additional ingredients such as lotus root, enoki mushrooms, etc. Contrary to my expectation, the ingredients had already been added to the pot when it was served. It would be better if they served the side ingredients separately and let us add them to the pot as and when we like, so that they do not overcook.

Stir-fried Clams

The stir-fried spicy clams were quite delicious, and the spiciness kicked in well. The sauce went well with the white rice I ordered. The portion of clams was generous too.

Stir-fried Frog Meat

Ordering this stir-fried frog meat was a disappointment because it did not taste as good. It was plainly spicy without any other flavour, and it was oily. It seemed like a malaxiangguo with two or three ingredients stir-fried together.

All the dishes we ordered were mildly spicy, but eating them at the same time made the spiciness feel as though it had increased to hot. That made me want to order a drink from their menu to ease the spiciness and the saltiness after the food.


I cannot remember clearly what drink I ordered. It could have been a concoction of Yakult with something else, and it tasted quite nice.

Address: 18 Mosque St, #01-01, Singapore 059498.

Breakthrough Cafe

It is located at People’s Park Centre, opposite Chinatown Point. It is easy to find and get to, as it faces the side of the Singapore State Court. It opens in the morning for dim sum breakfast and serves lunch before it closes around 4pm, according to my colleague.

Four of us walked to this restaurant for lunch, and we decided to share the dishes.

We tried the pig trotters, which are not my favourite at all, especially with the amount of fat on top of the meat. So, I took half of the egg soaked in the vinegar instead. This sauce is said to be rich in collagen and protein and to help with muscle strengthening and repair. No doubt it is one of those heart-warming dishes.

My favourite is the sesame oil chicken topped with shredded fried ginger. According to two of my colleagues, the shredded fried ginger takes a lot of effort and time to prepare and cook. The sauce is nice to mix with rice, and the amount of meat inside this little claypot is generous enough for us to share.

The sesame oil chicken always sells fast, so it is best to be there early for lunch before this dish sells out.

The less promising dish is the curry fish with assorted vegetables. The sauce is a little diluted. All of us agreed that this curry needs more santan (coconut milk) to make it taste richer and the sauce thicker. My colleagues also wished it had more pineapple. However, the portion is generous, with a lot of vegetables.

My colleague recommended that we try the egg tarts as well. They are Hong Kong-style egg tarts, which sell like hot cakes.

Address: 101A Upper Cross Street, #01-02A-C, People’s Park Centre, 058358 Singapore.

Happy International Women’s Day

Happy International Women’s Day! This year’s theme is #EachforEqual. The IWD website says, “We can actively choose to challenge stereotypes, fight bias, broaden perceptions, improve situations and celebrate women’s achievements.”

An equal world is an enabled world.

International Women’s Day, 2020.

Milestones Achieved

On this beautiful day, I am humbled to share the milestones that I achieved over the past year. I hope this will inspire more women to go forward and achieve their goals in life.

AI4I (AI for Industry)

AI Singapore is a national programme in artificial intelligence (AI), set up to enhance Singapore’s AI capabilities to power our future digital economy, according to its LinkedIn page. I first heard about this programme in December 2018, when a colleague shared it.

At that point in time, two other colleagues were doing AI proof-of-concept projects for the company, and one of them was quite good at machine learning. I thought it would be great to kick-start my Python learning journey on my own before I lost out. Since I was not involved in any Python projects, I thought this was the best way for me to pick up a new language.

I started the online learning and did a few modules in the first four months. Then, I took a six-month break from this programme to concentrate on my specialized diploma course.

I received my certificate of completion from AI Singapore in February 2020.

Completed One-Year Volunteer Work with TechLadies

I began my first volunteer work when I was chosen by TechLadies to assist in running the study groups for the year 2019. I am proud to help people, especially ladies, who are keen to start coding and move into the tech industry.

My initial idea for the study group was closely related to the online course I had started. It would be great if I could meet other ladies who took the same online course so that we could group up to complete it. However, I realized it would be better to build the foundation well before pushing new technologies or new programming languages to someone new to the tech industry.

Although I did not manage to run the Python study group during my one year of volunteering, it gave me enough ideas on what worked and what did not work well in the TechLadies community when running a study group event.

Time flew by, and I came to the end of my volunteering work with TechLadies. I recently had my graduation dinner with the group of great people I had worked with for a year.

Specialized Diploma in Business Analytics

I do agree that sometimes a specialized diploma does not carry the same weight as an actual diploma or degree. Before I started my course in mid-April, my colleagues had been telling me that I would be wasting my time attending classes at a polytechnic. I appreciated their advice, and I did not think it was wrong to assume that the modules taught at a polytechnic would be less detailed, less technical and easy to score.

However, my first semester of business intelligence modules at Temasek Polytechnic opened my eyes to using powerful tools such as Tableau and Power BI to perform simple data analysis and data visualization easily. It does not take a technical person to use the tools.

There were class discussions among coursemates from different industries, who tried to apply the concepts learned in class to real work. I loved the interactions and the knowledge-sharing sessions, which nobody would do in the office even though I tried to cultivate this culture in my previous company. Indeed, it needs a lot of time, effort and encouragement to make it work and keep it running.

Little did I know, my specialized diploma course was not just about business intelligence; the curriculum had been upgraded to include popular topics such as text analysis and machine learning. Indirectly, I completed some formal education in AI and machine learning through this course.

Studying helped me to identify what I wanted to do next. I had gained experience working on data visualization, and next I wanted to use my SQL and Python skills for data management. Therefore, I switched my focus from mastering data analyst skills to data management skills.

It has been a challenging yet fulfilling journey with Temasek Polytechnic, and I completed and submitted my last project last week. I would never call this course easy, but it is not hard either; it is just right to meet the basic needs, and the rest is up to me.

Finally, Learning Agile

Before I continue, I am now officially six months into my current role at my new workplace. I enjoy both the work and the people here, who emphasize and demonstrate good teamwork. Although the technology stack here is strictly limited to Microsoft products, we keep learning new technologies through sharing sessions.

My team decided to pick up agile and started to implement the Scrum methodology. I am excited to hop onto this bandwagon with the rest of the project teams. There are many things to learn and adapt to when it comes to agile. It gives me a chance to learn the proper way of working in a team. Someone told me to learn it!

Work Life Getting Busy, Yet I’m Happy

Many people think this can only happen when we are working on something we are passionate about. Yes, for now at least, the vision and mission are clear. The management included the project team members in the team planning last November. The director and management team listened to the ground staff in open discussion. Last week, during the knowledge-sharing session, we saw that some of the planned actions had been put in place. Things were never this well planned in my previous work, and everyone is able to visualize the future.

Another piece of news came to me: I have another project team to work with. In other words, I am handling two projects now. While a small restructuring has happened within my team, right now most of us are heading in our preferred directions for sure!

What is next, then? For real…

Looking into getting certifications

My boss introduced this certification to my project team during the knowledge-sharing session. He is DAMA certified and actively contributes to data management within the project teams and the organization. Data governance is not fancy work, and most of the time it is not implemented in a project because of cost and time. On top of that, not many people understand the need for it.

It is good enough to have just a handful of people who look into this seriously, and I have benefited from taking up my current role to maintain data governance. I think pursuing this certification is a good investment if I am keen to concentrate on data management and data governance work in future.

How can I put my learning to good use?

My coursemate inspired me to start a brown bag programme. I need a group of regulars, a mix of experienced and inexperienced people, who meet regularly to learn something in one hour and share their findings in the next. It increases the interaction between community members.

I do not have an exact plan for how to run this programme, or whether I want to try it within my department or project team or collaborate with tech communities. Any other ideas for conducting a brown bag session?

I intend to share more about my Python learning journey through my technical blog on Medium or through teaching sessions (if they do not mind having a newbie teach). In return, I hope to meet some regulars who actively use Python for data analysis and AI learning. I also want to engage with them proactively to build two-way communication. It helps to keep me learning the language, although I am not using it at work.

Mentorship Programme

Besides Python, I wish to pick up some skills in the big family of Apache projects, such as Apache Kafka for big data and Apache Spark for machine learning. In other words, I am looking into learning open-source technology, continuing from where I left off six months ago. I plan to put this into my mentorship programme. My mentor is a person who used to head a department in NUS.

What else can I do?

Besides what I mentioned above, what other options are available that I can explore and try out? I am open to suggestions.