Why Batch Convert Excel Files?
If you regularly work with data, you've probably encountered a situation where you need to convert multiple Excel files into different formats — JSON for your web API, CSV for data imports, SQL for database loading, or XML for enterprise integrations. Doing this one file at a time is tedious, error-prone, and a massive waste of productivity. Batch conversion automates this workflow, saving hours of manual work and ensuring consistency across all your output files.
In enterprise environments, batch conversion is even more critical. ETL (Extract, Transform, Load) pipelines often start with Excel files from various departments — finance sends spreadsheets in one format, HR in another, and operations in yet another. Converting all of these into standardized formats for your data warehouse is a daily operational task. This guide covers every approach, from simple browser-based tools to Python and Node.js automation scripts.
Method 1: Browser-Based Batch Conversion
The simplest approach for occasional batch conversions is using a browser-based tool like ConvertMatrix. While each conversion is done individually, the speed of drag-and-drop conversion makes it practical for small batches of 5-20 files.
Workflow
- Open the appropriate converter page (e.g., Excel to JSON)
- Drag and drop your Excel file onto the upload zone
- Adjust conversion options (indentation, table names, etc.)
- Click Convert and download the output
- Repeat for each file
The advantage of this method is zero setup — no software installation, no scripting, and complete privacy since all conversion happens in your browser. The downside is that it doesn't scale well beyond 20 or so files, and you can't automate it into a recurring pipeline.
Method 2: Python Batch Conversion with pandas
For programmatic batch conversion, Python with the pandas and openpyxl libraries is the gold standard. Here's a complete script that converts all Excel files in a directory to multiple formats:
import pandas as pd
import json
import os
import glob
from pathlib import Path
def batch_convert(input_dir, output_dir, formats=None):
"""Convert all Excel files in a directory to specified formats."""
if formats is None:
formats = ['csv', 'json', 'sql']
# Create output directories
for fmt in formats:
Path(os.path.join(output_dir, fmt)).mkdir(parents=True, exist_ok=True)
# Find all Excel files
excel_files = glob.glob(os.path.join(input_dir, '*.xlsx'))
excel_files += glob.glob(os.path.join(input_dir, '*.xls'))
print(f"Found {len(excel_files)} Excel files")
for filepath in excel_files:
filename = Path(filepath).stem
print(f"Converting: {filename}")
try:
# Read all sheets
xl = pd.ExcelFile(filepath)
for sheet_name in xl.sheet_names:
df = pd.read_excel(xl, sheet_name=sheet_name)
suffix = f"_{sheet_name}" if len(xl.sheet_names) > 1 else ""
# CSV output
if 'csv' in formats:
csv_path = os.path.join(output_dir, 'csv', f'{filename}{suffix}.csv')
df.to_csv(csv_path, index=False)
# JSON output
if 'json' in formats:
json_path = os.path.join(output_dir, 'json', f'{filename}{suffix}.json')
df.to_json(json_path, orient='records', indent=2)
# SQL output
if 'sql' in formats:
sql_path = os.path.join(output_dir, 'sql', f'{filename}{suffix}.sql')
generate_sql(df, f'{filename}{suffix}', sql_path)
# XML output
if 'xml' in formats:
xml_path = os.path.join(output_dir, 'xml', f'{filename}{suffix}.xml')
df.to_xml(xml_path, index=False)
# Markdown output
if 'markdown' in formats:
md_path = os.path.join(output_dir, 'markdown', f'{filename}{suffix}.md')
df.to_markdown(md_path, index=False)
except Exception as e:
print(f" Error: {e}")
print("Batch conversion complete!")
def generate_sql(df, table_name, output_path):
"""Generate SQL INSERT statements from DataFrame."""
clean_name = table_name.replace(' ', '_').replace('-', '_')
columns = df.columns.tolist()
with open(output_path, 'w') as f:
# CREATE TABLE
f.write(f"CREATE TABLE `{clean_name}` (\n")
col_defs = [f" `{col}` VARCHAR(255)" for col in columns]
f.write(",\n".join(col_defs))
f.write("\n);\n\n")
# INSERT statements
for _, row in df.iterrows():
values = [str(v).replace("'", "''") if pd.notna(v) else 'NULL' for v in row]
values_str = ", ".join([f"'{v}'" if v != 'NULL' else v for v in values])
f.write(f"INSERT INTO `{clean_name}` ({', '.join([f'`{c}`' for c in columns])}) VALUES ({values_str});\n")
# Usage
batch_convert(
input_dir='./excel_files',
output_dir='./converted',
formats=['csv', 'json', 'sql', 'xml', 'markdown']
)
Installing Dependencies
pip install pandas openpyxl tabulate lxml
This script handles multiple sheets per workbook, creates organized output directories, generates proper SQL with CREATE TABLE statements, and includes error handling for malformed files. You can easily extend it to support additional output formats.
Method 3: Node.js Batch Conversion
For JavaScript developers or when you need to integrate with Node.js build pipelines, the xlsx library provides excellent Excel parsing capabilities:
const XLSX = require('xlsx');
const fs = require('fs');
const path = require('path');
function batchConvert(inputDir, outputDir, formats = ['csv', 'json']) {
// Create output directories
formats.forEach(fmt => {
const dir = path.join(outputDir, fmt);
fs.mkdirSync(dir, { recursive: true });
});
// Find Excel files
const files = fs.readdirSync(inputDir)
.filter(f => /\.xlsx?$/i.test(f));
console.log(`Found ${files.length} Excel files`);
files.forEach(file => {
const filePath = path.join(inputDir, file);
const baseName = path.parse(file).name;
console.log(`Converting: ${baseName}`);
const workbook = XLSX.readFile(filePath);
workbook.SheetNames.forEach(sheetName => {
const sheet = workbook.Sheets[sheetName];
const suffix = workbook.SheetNames.length > 1 ? `_${sheetName}` : '';
// CSV
if (formats.includes('csv')) {
const csv = XLSX.utils.sheet_to_csv(sheet);
fs.writeFileSync(
path.join(outputDir, 'csv', `${baseName}${suffix}.csv`),
csv
);
}
// JSON
if (formats.includes('json')) {
const json = XLSX.utils.sheet_to_json(sheet);
fs.writeFileSync(
path.join(outputDir, 'json', `${baseName}${suffix}.json`),
JSON.stringify(json, null, 2)
);
}
// HTML
if (formats.includes('html')) {
const html = XLSX.utils.sheet_to_html(sheet);
fs.writeFileSync(
path.join(outputDir, 'html', `${baseName}${suffix}.html`),
html
);
}
});
});
console.log('Batch conversion complete!');
}
// Usage
batchConvert('./excel_files', './converted', ['csv', 'json', 'html']);
Method 4: Command-Line Tools
Using LibreOffice
LibreOffice can batch convert Excel files via the command line without a GUI:
# Convert all .xlsx files to CSV
for file in *.xlsx; do
libreoffice --headless --convert-to csv "$file"
done
# Convert to PDF
for file in *.xlsx; do
libreoffice --headless --convert-to pdf "$file"
done
Using csvkit
The csvkit Python package provides powerful command-line tools for data conversion:
# Install csvkit
pip install csvkit
# Convert Excel to CSV
in2csv input.xlsx > output.csv
# Convert all Excel files
for file in *.xlsx; do
in2csv "$file" > "${file%.xlsx}.csv"
done
# Convert CSV to JSON
csvjson input.csv > output.json
# Convert CSV to SQL
csvsql --insert --tables my_table input.csv > output.sql
Method 5: Scheduled Automation
For recurring batch conversions, set up automated workflows using cron jobs (Linux/Mac) or Task Scheduler (Windows):
Linux Cron Job Example
# Edit crontab
crontab -e
# Run batch conversion every day at 2 AM
0 2 * * * /usr/bin/python3 /opt/scripts/batch_convert.py >> /var/log/batch_convert.log 2>&1
GitHub Actions Workflow
name: Batch Convert Excel
on:
push:
paths: ['data/excel/**']
schedule:
- cron: '0 6 * * 1' # Every Monday at 6 AM
jobs:
convert:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with:
python-version: '3.11'
- run: pip install pandas openpyxl
- run: python scripts/batch_convert.py
- uses: actions/upload-artifact@v4
with:
name: converted-data
path: converted/
Handling Edge Cases
Multi-Sheet Workbooks
When an Excel file contains multiple sheets, you have several strategies: convert each sheet as a separate file (most common), merge all sheets into one output file, or only convert specific sheets. The Python and Node.js scripts above handle this by creating separate output files with the sheet name as a suffix.
Large Files
For Excel files larger than 50MB, consider using streaming approaches. The openpyxl library supports read-only mode (load_workbook(filename, read_only=True)) that reads data without loading the entire file into memory. Similarly, pandas supports chunked reading with the chunksize parameter.
Mixed Data Types
Excel columns can contain mixed data types (numbers, dates, strings, formulas). When converting to strongly-typed formats like SQL or Protobuf, you'll need to infer or specify column types. The Python script defaults to VARCHAR(255) for simplicity, but production pipelines should implement proper type detection.
Performance Comparison
| Method | Speed (100 files) | Setup Time | Automation | Best For |
|---|---|---|---|---|
| Browser-based | ~30 min | None | No | Quick one-offs |
| Python pandas | ~2 min | 15 min | Yes | Data science workflows |
| Node.js xlsx | ~3 min | 10 min | Yes | Web dev pipelines |
| LibreOffice CLI | ~5 min | 5 min | Yes | Server-side conversion |
| csvkit | ~4 min | 5 min | Yes | CLI power users |
Conclusion
Batch converting Excel files doesn't have to be a manual ordeal. Whether you choose a browser-based approach for quick one-off conversions using ConvertMatrix, Python scripts for data science workflows, Node.js for web development pipelines, or command-line tools for server automation, there's a solution that fits your needs. The key is choosing the right tool for your volume, frequency, and technical requirements.
Start with browser-based conversion for small batches, graduate to scripting when you need automation, and consider CI/CD integration when batch conversion becomes part of your regular data pipeline. With the right approach, you can process hundreds of files in minutes rather than hours.
Try Our Free Conversion Tools
Put what you've learned into practice with our browser-based converters: