
Working with Data

Get your data ready for analysis with Probably. Learn about supported formats, preparation tips, and connection options.

Drag and drop any of these formats:

  • .csv: CSV Files
  • .xlsx: Excel Files
  • .json: JSON Data
  • .parquet: Parquet Files

For Best Results

  • Use descriptive column names (e.g., “customer_age” not “col1”)
  • Include dates in YYYY-MM-DD format when possible
  • Keep categorical data consistent (e.g., “Yes/No” not “Y/Yes/True”)
  • Don’t worry about missing values - Probably handles them automatically
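The date and category tips above can be applied before upload. This is a minimal sketch using only the Python standard library; the `YES_NO` mapping, the list of accepted date layouts, and both function names are illustrative assumptions, not part of Probably.

```python
from datetime import datetime

# Hypothetical mapping from inconsistent spellings to one canonical label.
YES_NO = {
    "y": "Yes", "yes": "Yes", "true": "Yes",
    "n": "No", "no": "No", "false": "No",
}

# Date layouts this sketch accepts, tried in order (an assumption).
# Note that "%m/%d/%Y" treats 03/04/2024 as March 4; adjust for your data.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d.%m.%Y")


def normalize_yes_no(value: str) -> str:
    """Map variants such as 'Y' or 'True' to 'Yes'/'No'; leave other text unchanged."""
    return YES_NO.get(value.strip().lower(), value)


def to_iso_date(value: str) -> str:
    """Return the date as YYYY-MM-DD, or the original text if no layout matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return value
```

Values that match none of the known variants are returned unchanged, so nothing is silently rewritten into a wrong value.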

Connect to live databases for real-time analysis:

  1. Click “Connect Database” on the welcome screen
  2. Select your database type (PostgreSQL, MySQL, SQLite, etc.)
  3. Enter connection details:
    Host: your-database-host.com
    Port: 5432
    Database: your_database_name
    Username: your_username
    Password: your_password
  4. Test the connection, then select the tables to analyze
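Before using the connection dialog, it can help to confirm outside the app that the database opens and contains the tables you expect. The sketch below does this for SQLite only, using Python's built-in `sqlite3` module; the `list_tables` helper is a hypothetical name, and this is not how Probably itself connects.

```python
import sqlite3


def list_tables(conn: sqlite3.Connection) -> list[str]:
    """Return the names of user tables in an open SQLite connection, sorted."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    # Skip SQLite's internal bookkeeping tables (their names start with 'sqlite_').
    return [name for (name,) in rows if not name.startswith("sqlite_")]


if __name__ == "__main__":
    # An in-memory database stands in for a real file path here.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    print(list_tables(conn))
    conn.close()
```

For PostgreSQL or MySQL the same idea applies, but it needs a third-party driver, which this sketch leaves out.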

Column headers

  • Unique column names: Avoid duplicate headers
  • Consistent data types: Numbers as numbers, dates as dates
  • Clear naming: Use underscores instead of spaces when possible
  • Meaningful names: “revenue_2024” instead of “data1”

Data formatting

  • Dates: ISO format (YYYY-MM-DD) works best
  • Numbers: Remove currency symbols ($, €) when possible
  • Categories: Use consistent spelling and capitalization
  • Text: UTF-8 encoding for international characters

Things to avoid

  • Mixed data types in single columns
  • Headers spanning multiple rows
  • Embedded charts or formatting in Excel files
  • Password-protected or locked files

Small files

  • Load instantly
  • Full feature availability
  • Best for initial exploration

Medium files

  • Optimized loading with progress indicators
  • All features available
  • Recommended to filter early for better performance

Large files

  • Streaming processing for memory efficiency
  • Consider using database connections instead
  • Sample data for initial exploration, then work with full dataset

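One way to sample a very large CSV for initial exploration is to copy the header and the first rows into a smaller file without loading the whole dataset into memory. This is a minimal sketch; the `write_sample` name and the default of 1000 rows are assumptions, and taking the first rows is not a random sample, so it may not be representative if the file is sorted.

```python
import csv
import io
from itertools import islice
from typing import TextIO


def write_sample(src: TextIO, dst: TextIO, max_rows: int = 1000) -> int:
    """Copy the header plus the first max_rows data rows from src to dst.

    Rows are streamed one at a time. Returns the number of data rows written.
    """
    reader = csv.reader(src)
    writer = csv.writer(dst)
    header = next(reader, None)
    if header is None:  # empty input: nothing to copy
        return 0
    writer.writerow(header)
    count = 0
    for row in islice(reader, max_rows):
        writer.writerow(row)
        count += 1
    return count


if __name__ == "__main__":
    # In-memory text stands in for real files in this demonstration.
    source = io.StringIO("id,amount\n1,10\n2,20\n3,30\n")
    target = io.StringIO()
    print(write_sample(source, target, max_rows=2))
```

With real files, open both with `newline=""` and `encoding="utf-8"` as the `csv` module documentation recommends, then pass the file objects to `write_sample`.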
File won’t load

• Check that the file format is supported (.csv, .xlsx, .json, .parquet)

• Ensure the file isn’t corrupted or password-protected

• If an Excel file fails to load, try saving it as CSV

Data not displaying correctly

• Check that the first row contains the column headers

• Verify that date formats are consistent

• Ensure numeric columns don’t contain text

• Remove empty rows at the beginning of the file
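Several of the checks above can be automated for CSV files before upload. The sketch below reports leading empty rows, duplicate headers, and columns that mix numbers with text; `check_csv` and its messages are hypothetical, and it reads the whole file into memory, so it suits small and medium files only.

```python
import csv
import io
from collections import Counter
from typing import TextIO


def _is_number(text: str) -> bool:
    """True if the text parses as a float (for example '10' or '3.5')."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def check_csv(src: TextIO) -> list[str]:
    """Return a list of human-readable problems found in a CSV file."""
    rows = list(csv.reader(src))
    if not rows:
        return ["file is empty"]
    problems = []
    # Count rows with no non-blank cell before the first real row.
    leading_blank = 0
    for row in rows:
        if any(cell.strip() for cell in row):
            break
        leading_blank += 1
    if leading_blank:
        problems.append(f"{leading_blank} empty row(s) before the header")
    if leading_blank == len(rows):
        return problems
    header = [name.strip() for name in rows[leading_blank]]
    data = rows[leading_blank + 1:]
    for name, seen in Counter(header).items():
        if seen > 1:
            problems.append(f"duplicate header: {name!r}")
    # A column is flagged when some, but not all, non-empty values are numeric.
    for i, name in enumerate(header):
        values = [r[i].strip() for r in data if i < len(r) and r[i].strip()]
        numeric = sum(1 for v in values if _is_number(v))
        if 0 < numeric < len(values):
            problems.append(f"column {name!r} mixes numbers and text")
    return problems


if __name__ == "__main__":
    sample = io.StringIO("\nid,amount\n1,10\n2,abc\n")
    for problem in check_csv(sample):
        print(problem)
```

An empty result means none of these three issues was detected; it does not guarantee the file will display correctly.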

Database connection fails

• Verify host address and port number

• Check username and password

• Ensure database allows external connections

• Check firewall settings
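
To separate network problems from credential problems, you can first check whether the host and port accept TCP connections at all. This is a minimal sketch with Python's standard `socket` module; `can_reach` is a hypothetical helper, and a successful result only shows that something is listening on the port, not that the login will succeed.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS lookup failures.
        return False


if __name__ == "__main__":
    # Placeholder host and port taken from the example connection details.
    print(can_reach("your-database-host.com", 5432))
```

If this returns False from your machine, the cause is more likely the host address, port, firewall, or the database's external-connection settings than the username or password.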

Learn Good Questions

Discover what types of questions work best with your data and the AI agent.

Explore Data Sources

Learn about all the different ways to get data into Probably for analysis.