I am trying to read a date column from a CSV file. This column contains dates in just one format. Please see data below:
The problem arises when I try to read it using a custom date parser.
from datetime import datetime
import pandas as pd

dateparse = lambda x: datetime.strptime(x, '%m/%d/%Y').date()
df = pd.read_csv('products.csv', parse_dates=['DateOfRun'], date_parser=dateparse)
The above logic works fine in most cases, but sometimes, seemingly at random, I get an error that the format does not match. Example below:
ValueError: time data '2020-02-23' does not match format '%m/%d/%Y'
Does anyone know how this is possible? That yyyy-mm-dd format is not in my data. Any tips will be useful.
The problem happens when you open the CSV file in Excel. By default, and depending on your OS settings, Excel automatically changes the date format. For instance, in the USA the default format is MM/DD/YYYY, so if you have a date in a CSV file such as YYYY-MM-DD, it will automatically change it to MM/DD/YYYY.
The solution is to NOT open the CSV file in Excel before manipulating it in Python. If you must open it to inspect it, look at it either in Python or in Notepad or some other text editor.
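To look at the raw values from Python rather than Excel, something along these lines could work. This is a minimal sketch using only the standard library; `peek_column` is a hypothetical helper name, and the file and column names are taken from the question.

```python
import csv
from itertools import islice


def peek_column(path, column, n=5):
    """Return the first n raw string values of `column`, exactly as stored on disk."""
    with open(path, newline='') as f:
        reader = csv.DictReader(f)  # uses the first row as the header
        return [row[column] for row in islice(reader, n)]


# Hypothetical usage, assuming products.csv has a DateOfRun column:
# print(peek_column('products.csv', 'DateOfRun'))
```

Because the values come back as plain strings, any mix of MM/DD/YYYY and YYYY-MM-DD is visible before a parser gets a chance to fail on it.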
I always assume that dates are going to be screwed up because someone might have opened the file in Excel, so I test for the proper format and convert the value if that check fails.
For example, if you want to convert dates from YYYY-MM-DD to MM/DD/YYYY, try this:
from datetime import datetime

def change_dates(date_string):
    try:
        # Already MM/DD/YYYY? Then return it unchanged.
        assert datetime.strptime(date_string, '%m/%d/%Y'), 'format error'
        return date_string
    except (AssertionError, ValueError):
        # Otherwise assume YYYY-MM-DD and convert it to MM/DD/YYYY.
        dt = datetime.strptime(date_string, '%Y-%m-%d')
        return dt.strftime('%m/%d/%Y')
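An alternative is to parse straight to a date and accept either format, instead of rewriting the string first. The sketch below assumes the only two formats that ever appear are MM/DD/YYYY and YYYY-MM-DD; `KNOWN_FORMATS` and `parse_run_date` are hypothetical names, not part of any library.

```python
from datetime import datetime

# Assumption: these are the only two layouts that occur in the file.
KNOWN_FORMATS = ('%m/%d/%Y', '%Y-%m-%d')


def parse_run_date(date_string):
    """Try each known format in turn and return a datetime.date."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(date_string.strip(), fmt).date()
        except ValueError:
            continue  # wrong layout, try the next one
    raise ValueError(f'unrecognized date format: {date_string!r}')
```

With pandas, a function like this could probably be applied per column through the `converters` argument of `pd.read_csv`, for example `converters={'DateOfRun': parse_run_date}`, in place of the lambda from the question.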